Abstract. We extend de Finetti's (1937) notion of exchangeability to finite and countable
sequences of variables, when a subject's beliefs about them are modelled using coherent
lower previsions rather than (linear) previsions. We prove representation theorems in
both the finite and the countable case, in terms of sampling without and with replacement,
respectively. We also establish a convergence result for sample means of exchangeable 
sequences. Finally, we study and solve the problem of exchangeable natural extension: 
how to find the most conservative (point-wise smallest) coherent and exchangeable lower 
prevision that dominates a given lower prevision. 



1. Introduction 

This paper deals with belief models for both finite and countable sequences of exchange- 
able random variables taking a finite number of values. When such sequences of random 
variables are assumed to be exchangeable, this more or less means that the specific order 
in which they are observed is deemed irrelevant. 



The first detailed study of exchangeability was made by de Finetti (1937) (with the
terminology of 'equivalent' events). He proved the now famous Representation Theorem,
which is often interpreted as stating that a sequence of random variables is exchangeable
if it is conditionally independent and identically distributed (IID).^1 Other important
work on exchangeability was done by, amongst many others, Hewitt and Savage (1955),
Heath and Sudderth (1976), Diaconis and Freedman (1980) and, in the context of the
behavioural theory of imprecise probabilities that we are going to consider here, by Walley
(1991). We refer to Kallenberg (2002, 2005) for modern, measure-theoretic discussions of
exchangeability.

One of the reasons why exchangeability is deemed important, especially by Bayesians, 
is that, by virtue of de Finetti's Representation Theorem, an exchangeable model can be
seen as a convex mixture of multinomial models. This has given some ground (de Finetti,
1937, 1975; Dawid, 1985) to the claim that aleatory probabilities and IID processes can be
eliminated from statistics, and that we can restrict ourselves to considering exchangeable
sequences instead.^2

De Finetti presented his study of exchangeability in terms of the behavioural notion
of previsions, or fair prices. The central assumption underlying his approach is that a
subject should be able to specify a fair price $P(f)$ for any risky transaction (which we
shall call a gamble) $f$ (de Finetti, 1974, Chapter 3). This is tantamount to requiring that
he should always be willing and able to decide, for any real number $r$, between selling the
gamble $f$ for $r$, or buying it for that price. This may not always be realistic, and for this
reason, it has been suggested that we should explicitly allow for a subject's indecision, by
distinguishing between his lower prevision $\underline{P}(f)$, which is the supremum price for
which he is willing to buy the gamble $f$, and his upper prevision $\overline{P}(f)$, which is the
infimum price for which he is willing to sell $f$. For any real number $r$ strictly between
$\underline{P}(f)$ and $\overline{P}(f)$, the subject is then not specifying a choice between selling or
buying the gamble $f$ for $r$. Such lower and upper previsions are also subject to certain
rationality or coherence criteria, in very much the same way as (precise) previsions are on
de Finetti's account. The resulting theory of coherent lower previsions, sometimes also called
the behavioural theory of imprecise probabilities, and brilliantly defended by Walley (1991),
generalises de Finetti's behavioural treatment of subjective, epistemic probability, and tries
to make it more realistic by allowing for a subject's indecision. We give a brief overview of
this theory in Section 2.

^1 See de Finetti (1975, Section 11.4); and Cifarelli and Regazzini (1996) for an overview of
de Finetti's work.

^2 For a critical discussion of this claim, see Walley (1991, Section 9.5.6).

Also in this theory, it is interesting to consider what are the consequences of a subject's 
exchangeability assessment, i.e., that the order in which we consider a number of random 
variables is of no consequence. This is our motivation for studying exchangeable lower 
previsions in this paper. An assessment of exchangeability will have a clear impact on the 
structure of so-called exchangeable coherent lower previsions. We shall show they can be 
written as a combination of (i) a coherent (linear) prevision expressing that permutations 
of realisations of such sequences are considered equally likely, and (ii) a coherent lower 
prevision for the 'frequency' of occurrence of the different values the random variables can 
take. Of course, this is the essence of representation in de Finetti's sense: we generalise 
his results to coherent lower previsions. 

A subject's probability assessments may be local, in the sense that they concern the 
probabilities or previsions of specific events or random variables. Assessments may on
the other hand also be structural (see Walley, 1991, Chapter 9), in which case they specify
relationships that should hold between the probabilities or previsions of a number of events 
or random variables. One may wonder if (and how) it is possible to combine local with 
structural assessments, such as exchangeability. We show that this is indeed the case, and 
give a surprisingly simple procedure, called exchangeable natural extension, for finding 
the point-wise smallest (most conservative) coherent and exchangeable lower prevision 
that dominates the local assessments. As an example, we use our conclusions to take a 
fresh look at the old question whether a given exchangeable model for n variables can be 
extended to an exchangeable model for n + k variables. 

Before we go on, we want to draw attention to a number of distinctive features of our
approach. First of all, the usual proofs of the Representation Theorem, such as the ones
given by de Finetti (1937), Heath and Sudderth (1976), or Kallenberg (2005), do not lend
themselves very easily to a generalisation in terms of coherent lower previsions. In
principle it would be possible, at least in some cases, to start with the versions already
known for (precise) previsions, and to derive their counterparts for lower previsions using
so-called lower envelope theorems (see Section 2 for more details). This is the method that
Walley (1991, Sections 9.5.3 and 9.5.4) suggests. But we have decided to follow a different
route: we derive our results directly for lower previsions, using an approach based on
Bernstein polynomials, and we obtain the ones for previsions as special cases. We believe this
method to be more elegant and self-contained, and it certainly has the additional benefit of
drawing attention to what we feel is the essence of de Finetti's Representation Theorem:
specifying a coherent belief model for a countable exchangeable sequence is tantamount to
specifying a coherent (lower) prevision on the linear space of polynomials on some simplex,
and nothing more.

Secondly, we shall focus on, and use the language of, (lower and upper) previsions for
gambles, rather than (lower and upper) probabilities for events. Our emphasis on prevision
or expectation, rather than probability, is in keeping with de Finetti's (1974) and Whittle's
(2000) approach to probabilistic modelling. But it is not merely a matter of aesthetic
preference: as we shall see, in the behavioural theory of imprecise probabilities, the language
of gambles is much more expressive than that of events, and we need its full expressive
power to derive our results.

The plan of the paper is as follows. In Section 2 we introduce a number of results from
the theory of coherent lower previsions necessary to understand the rest of the paper. In
Section 3 we define exchangeability for finite sequences of random variables, and establish
a representation of coherent exchangeable lower previsions in terms of sampling without
replacement. In Section 4 we extend the notion of exchangeability to countable sequences
of random variables, and in Section 5 we generalise de Finetti's Representation Theorem
(in terms of multinomial sampling) to exchangeable coherent lower previsions. The results
we obtain allow us to develop a limit law for sample means in Section 6. Section 7
deals with exchangeable natural extension: combining local assessments with
exchangeability. In an appendix, we have gathered a few useful results about multivariate
Bernstein polynomials.

2. Lower previsions, random variables and their distributions

In this section, we want to provide a brief summary of ideas, and known as well as new
results from the theory of coherent lower previsions (Walley, 1991). This should lead to
a better understanding of the developments in the sections that follow. For results that are
mentioned without proof, proofs can be found in Walley (1991).



2.1. Epistemic uncertainty models. Consider a random variable $X$ that may assume values
$x$ in some non-empty set $\mathcal{X}$. By 'random', we mean that a subject is uncertain about
the actual value of the variable $X$, i.e., does not know what this actual value is. But we
do assume that the actual value of $X$ can be determined, at least in principle. Thus we
may for instance consider tossing a coin, where $X$ is the outcome of the coin toss, and
$\mathcal{X} = \{\text{heads}, \text{tails}\}$. It does not really matter here to distinguish between a subject's belief
before tossing the coin, or after the toss where, say, the outcome has been kept hidden from 
the subject. All that matters for us here is that our subject is in a state of (partial) ignorance 
because of a lack of knowledge. The uncertainty models that we are going to describe here 
are therefore epistemic, rather than physical, probability models. 

Our subject may be uncertain about the value of X, but he may entertain certain beliefs 
about it. These beliefs may lead him to engage in certain risky transactions whose outcome 
depends on the actual value of X. We are going to try and model his beliefs mathematically 
by zooming in on such risky transactions. They are captured by the mathematical concept 
of a gamble on $\mathcal{X}$, which is a bounded map $f$ from $\mathcal{X}$ to the set $\mathbb{R}$ of real numbers. A
gamble $f$ represents a random reward: if the subject accepts $f$, this means that he is willing
to engage in the following transaction: we determine the actual value $x$ that $X$ assumes
in $\mathcal{X}$, and then the subject receives the (possibly negative) reward $f(x)$, expressed in units
of some predetermined linear utility. Let us denote by $\mathcal{L}(\mathcal{X})$ the set of all gambles on
$\mathcal{X}$.

De Finetti (1974) has proposed to model a subject's beliefs by eliciting his fair price,
or prevision, $P(f)$ for certain gambles $f$. This $P(f)$ can be defined as the unique real
number $p$ such that the subject is willing to buy the gamble $f$ for all prices $s$ (i.e., accept
the gamble $f - s$) and sell $f$ for all prices $t$ (i.e., accept the gamble $t - f$) for all $s < p < t$.
The problem with this approach is that it presupposes that there is such a real number, or,
in other words, that the subject, whatever his beliefs about $X$ are, is willing, for (almost)
every real $r$, to make a choice between buying $f$ for the price $r$, or selling it for that price.

2.2. Coherent lower previsions and natural extension. A way to address this problem
is to consider a model which allows our subject to be undecided for some prices $r$. This is
done in Walley's (1991) theory of lower and upper previsions. The lower prevision of the
gamble $f$, $\underline{P}(f)$, is our subject's supremum acceptable buying price for $f$; similarly, our
subject's upper prevision, $\overline{P}(f)$, is his infimum acceptable selling price for $f$. Hence, he is
willing to buy the gamble $f$ for all prices $t < \underline{P}(f)$ and sell $f$ for all prices $s > \overline{P}(f)$, but he
may be undecided for prices $\underline{P}(f) \leq p \leq \overline{P}(f)$.

Since buying the gamble $f$ for a price $t$ is the same as selling the gamble $-f$ for the
price $-t$ [in both cases we accept the gamble $f - t$], the lower and upper previsions are
conjugate functions: $\overline{P}(f) = -\underline{P}(-f)$ for any gamble $f$. This allows us to concentrate on
one of these functions, since we can immediately derive results for the other. In this paper,
we focus mainly on lower previsions.

If a subject has made assessments about the supremum buying price (lower prevision)
for all gambles in some domain $\mathcal{K}$, we have to check that these assessments are consistent
with each other. First of all, we say that the lower prevision $\underline{P}$ avoids sure loss when

\[ \sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] \right] \geq 0 \tag{1} \]

for any natural number $n$, any gambles $f_1, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers
$\lambda_1, \dots, \lambda_n$. When the inequality (1) is not satisfied, there is some non-negative combination
of acceptable transactions that results in a transaction that makes our subject lose utiles, no
matter the outcome, and we then say that his lower prevision $\underline{P}$ incurs sure loss.
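As a small illustration of condition (1), the following sketch evaluates its left-hand side for the coin toss of Section 2.1. The two sets of assessments (`too_greedy`, `cautious`) and the helper names are hypothetical and chosen only for this example. A negative value for one choice of the $\lambda_k$ is enough to show that the assessments incur sure loss; a non-negative value for a single choice does not by itself establish that sure loss is avoided.

```python
from fractions import Fraction as F

# Possibility space of the coin toss example.
X = ["heads", "tails"]

def indicator(event):
    """Indicator gamble of an event, given as a set of outcomes."""
    return lambda x: 1 if x in event else 0

def sup_combination(assessments, lambdas):
    """Left-hand side of (1): sup over x of sum_k lambda_k * [f_k(x) - P(f_k)].

    `assessments` is a list of pairs (gamble, lower prevision of that gamble).
    """
    return max(
        sum(lam * (f(x) - price) for (f, price), lam in zip(assessments, lambdas))
        for x in X
    )

# Hypothetical buying prices for the indicators of heads and of tails.
too_greedy = [(indicator({"heads"}), F(3, 5)), (indicator({"tails"}), F(3, 5))]
cautious = [(indicator({"heads"}), F(2, 5)), (indicator({"tails"}), F(2, 5))]

# Buying both indicators costs 6/5 but always pays out exactly 1: a sure loss.
print(sup_combination(too_greedy, [1, 1]))  # -1/5
# With prices 2/5 each, the same combination yields a sure gain of 1/5.
print(sup_combination(cautious, [1, 1]))    # 1/5
```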
More generally, we say that the lower prevision $\underline{P}$ is coherent when

\[ \sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] - \lambda_0 \left[ f_0(x) - \underline{P}(f_0) \right] \right] \geq 0 \tag{2} \]

for any natural number $n$, any gambles $f_0, \dots, f_n$ in $\mathcal{K}$ and any non-negative real numbers
$\lambda_0, \dots, \lambda_n$. Coherence means that our subject's supremum acceptable buying price for a
gamble $f$ in the domain cannot be raised by considering the acceptable transactions implicit
in other gambles. In particular, it means that $\underline{P}$ avoids sure loss. We call an upper prevision
coherent if its conjugate lower prevision is.

If a lower prevision $\underline{P}$ is defined on a linear space of gambles $\mathcal{K}$, then the coherence
requirement (2) is equivalent to the following conditions: for any gambles $f$ and $g$ in $\mathcal{K}$
and any non-negative real number $\lambda$, it should hold that:

(P1) $\underline{P}(f) \geq \inf f$ [accepting sure gains];

(P2) $\underline{P}(\lambda f) = \lambda \underline{P}(f)$ [non-negative homogeneity];

(P3) $\underline{P}(f + g) \geq \underline{P}(f) + \underline{P}(g)$ [super-additivity].

Moreover, a lower prevision on a general domain is coherent if and only if it can be
extended to a coherent lower prevision on some linear space.

A coherent lower prevision that is defined on indicators of events only is called a
coherent lower probability. The indicator $I_A$ of an event $A$ is the $\{0, 1\}$-valued gamble given by
$I_A(x) := 1$ if $x \in A$ and $I_A(x) := 0$ otherwise.

On the other hand, a lower prevision $\underline{P}$ on some set of gambles $\mathcal{K}$ that avoids sure
loss can always be 'corrected' and extended to a coherent lower prevision on $\mathcal{L}(\mathcal{X})$,
in a least-committal manner: the (point-wise) smallest, and therefore most conservative,
coherent lower prevision on $\mathcal{L}(\mathcal{X})$ that (point-wise) dominates $\underline{P}$ on $\mathcal{K}$, is called the
natural extension of $\underline{P}$, and it is given for all $f$ in $\mathcal{L}(\mathcal{X})$ by

\[ \underline{E}(f) := \sup \left\{ \inf_{x \in \mathcal{X}} \left[ f(x) - \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - \underline{P}(f_k) \right] \right] \colon n \geq 0,\ \lambda_k \geq 0,\ f_k \in \mathcal{K} \right\}. \tag{3} \]

The natural extension of $\underline{P}$ provides the supremum acceptable buying prices that we can
derive for any gamble $f$ taking into account only the buying prices for the gambles in $\mathcal{K}$
and the notion of coherence. Interestingly, $\underline{P}$ is coherent if and only if it coincides with its
natural extension $\underline{E}$ on its domain $\mathcal{K}$, and in that case $\underline{E}$ is the point-wise smallest coherent
lower prevision that extends $\underline{P}$ to $\mathcal{L}(\mathcal{X})$.



2.3. Linear previsions. If the lower prevision $\underline{P}(f)$ and the upper prevision $\overline{P}(f)$ for a
gamble $f$ happen to coincide, then the common value $P(f) = \underline{P}(f) = \overline{P}(f)$ is called the
subject's (precise) prevision for $f$. Previsions are fair prices in de Finetti's (1974) sense.
We shall call them precise probability models, and lower previsions will be called
imprecise. Specifying a prevision $P$ on a domain $\mathcal{K}$ is tantamount to specifying both a lower
prevision $\underline{P}$ and an upper prevision $\overline{P}$ on $\mathcal{K}$ such that $\underline{P}(f) = \overline{P}(f) = P(f)$. Since then, by
conjugacy, $P(f) = -\underline{P}(-f) = -\overline{P}(-f)$, it is also equivalent to specifying a lower prevision
$\underline{P}$ on the larger and negation invariant domain $\mathcal{K}' := \mathcal{K} \cup -\mathcal{K}$, by letting $\underline{P}(f) := P(f)$
if $f \in \mathcal{K}$ and $\underline{P}(f) := -P(-f)$ if $f \in -\mathcal{K}$. This prevision $P$ is then called coherent, or
linear, if and only if the associated lower prevision $\underline{P}$ is coherent, and this is equivalent to
the following condition

\[ \sup_{x \in \mathcal{X}} \left[ \sum_{k=1}^{n} \lambda_k \left[ f_k(x) - P(f_k) \right] - \sum_{\ell=1}^{m} \mu_\ell \left[ g_\ell(x) - P(g_\ell) \right] \right] \geq 0 \]

for any natural numbers $n$ and $m$, any gambles $f_1, \dots, f_n$ and $g_1, \dots, g_m$ in $\mathcal{K}$ and any
non-negative real numbers $\lambda_1, \dots, \lambda_n$ and $\mu_1, \dots, \mu_m$.

A prevision on the set $\mathcal{L}(\mathcal{X})$ of all gambles is linear if and only if it is a positive
($f \geq 0 \Rightarrow P(f) \geq 0$) and normed ($P(1) = 1$) real linear functional. A prevision on a general
domain is linear if and only if it can be extended to a linear prevision on all gambles. We
shall denote by $\mathbb{P}(\mathcal{X})$ the set of all linear previsions on $\mathcal{L}(\mathcal{X})$.

The restriction of a linear prevision $P$ on $\mathcal{L}(\mathcal{X})$ to the set $\wp(\mathcal{X})$ of (indicators of)
all events, is a finitely additive probability. Conversely, a finitely additive probability on
$\wp(\mathcal{X})$ has a unique extension (namely, its natural extension as a coherent lower
probability) to a linear prevision on $\mathcal{L}(\mathcal{X})$. In this sense, such linear previsions and finitely
additive probabilities can be considered equivalent: for precise probability models, the 
language of events is as expressive as that of gambles. 

A linear prevision that is defined on indicators of events only, and therefore called a 
coherent probability, is always the restriction of some finitely additive probability. 

There is an interesting link between precise and imprecise probability models, expressed
through the following so-called lower envelope theorem: A lower prevision $\underline{P}$ on some
domain $\mathcal{K}$ is coherent if and only if it is the lower envelope of some set of linear previsions,
and in particular of the convex set $\mathcal{M}(\underline{P})$ of all linear previsions that dominate it: for all $f$
in $\mathcal{K}$,

\[ \underline{P}(f) = \min \{ P(f) \colon P \in \mathcal{M}(\underline{P}) \}, \]

where $\mathcal{M}(\underline{P}) := \{ P \in \mathbb{P}(\mathcal{X}) \colon (\forall f \in \mathcal{K})(P(f) \geq \underline{P}(f)) \}$. We can also use the set $\mathcal{M}(\underline{P})$
to calculate the natural extension of $\underline{P}$: for any gamble $f$ on $\mathcal{X}$, we have that

\[ \underline{E}(f) = \inf \{ P(f) \colon P \in \mathcal{M}(\underline{P}) \}. \]

If we have a coherent lower probability defined on some set of events, then there will
generally be many (i.e., an infinity of) coherent lower previsions that extend it to all
gambles. In this sense, the language of gambles is actually more expressive than that of events
when we are considering lower rather than precise previsions. As already signalled in the
Introduction, this is the main reason why in the following sections, we shall formulate our
study of exchangeable lower previsions in terms of gambles and lower previsions rather
than events and lower probabilities.
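To make the lower envelope theorem concrete, here is a minimal sketch on a hypothetical three-element space. The set `CREDAL` of two mass functions is an assumption made purely for illustration; each mass function defines a linear prevision, and their lower envelope is checked against (P1)-(P3) and conjugacy on a small grid of gambles.

```python
from fractions import Fraction as F
from itertools import product

X = ["a", "b", "c"]  # hypothetical possibility space

# Hypothetical finite set of mass functions; each one defines a linear prevision.
CREDAL = [
    {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)},
    {"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
]

def linear(p, f):
    """Linear prevision (expectation) of the gamble f under the mass function p."""
    return sum(p[x] * f[x] for x in X)

def lower(f):
    """Lower envelope: minimum of the linear previsions over CREDAL."""
    return min(linear(p, f) for p in CREDAL)

def upper(f):
    """Upper envelope: maximum of the linear previsions over CREDAL."""
    return max(linear(p, f) for p in CREDAL)

def add(f, g):
    return {x: f[x] + g[x] for x in X}

def scale(lam, f):
    return {x: lam * f[x] for x in X}

# A small grid of test gambles: all maps from X to {-1, 0, 2}.
gambles = [dict(zip(X, vals)) for vals in product([-1, 0, 2], repeat=len(X))]

for f in gambles:
    assert lower(f) >= min(f.values())            # (P1) accepting sure gains
    assert lower(scale(3, f)) == 3 * lower(f)     # (P2) non-negative homogeneity
    assert upper(f) == -lower(scale(-1, f))       # conjugacy
    for g in gambles:
        assert lower(add(f, g)) >= lower(f) + lower(g)  # (P3) super-additivity

# Super-additivity can be strict: indicators of {a} and {b}.
I_a = {"a": 1, "b": 0, "c": 0}
I_b = {"a": 0, "b": 1, "c": 0}
print(lower(I_a), lower(I_b), lower(add(I_a, I_b)))  # 1/4 1/4 3/4
```

The last line also shows why events alone are less expressive here: the lower probabilities of {a} and {b} do not determine the lower prevision of their sum.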

2.4. Important consequences of coherence. Let us list a few consequences of coherence
that we shall have occasion to use further on. Besides the properties (P1)-(P3) we have
already mentioned that hold when the domain of $\underline{P}$ is a linear space, the following properties
hold for a coherent lower prevision whenever the gambles involved belong to its domain:

(i) $\underline{P}$ is monotone: if $f \leq g$, then $\underline{P}(f) \leq \underline{P}(g)$.

(ii) $\inf f \leq \underline{P}(f) \leq \overline{P}(f) \leq \sup f$.

Moreover, coherent lower and upper previsions are continuous with respect to uniform
convergence of gambles: if a sequence of gambles $f_n$ converges uniformly to a gamble $f$,
meaning that for every $\varepsilon > 0$ there is some $n_0$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \geq n_0$
and for all $x \in \mathcal{X}$, then $\underline{P}(f_n)$ converges to $\underline{P}(f)$ and $\overline{P}(f_n)$ converges to $\overline{P}(f)$. In particular,
this implies that a coherent lower prevision defined on some domain $\mathcal{K}$ can be uniquely
extended to a coherent lower prevision on the uniform closure of $\mathcal{K}$. As an immediate
corollary, a coherent lower prevision on $\mathcal{L}(\mathcal{X})$ is uniquely determined by the values it
assumes on simple gambles, i.e., gambles that assume only a finite number of values.

We end this section by introducing a number of new notions, which cannot be found in
Walley (1991). They generalise familiar definitions in standard, measure-theoretic
probability to a context where coherent lower previsions are used as belief models.

2.5. The distribution of a random variable. We shall call a subject's coherent lower
prevision $\underline{P}$ on $\mathcal{L}(\mathcal{X})$, modelling his beliefs about the value that a random variable $X$
assumes in the set $\mathcal{X}$, his distribution for that random variable.

Now consider another set $\mathcal{Y}$, and a map $\varphi$ from $\mathcal{X}$ to $\mathcal{Y}$, then we can consider
$Y := \varphi(X)$ as a random variable assuming values in $\mathcal{Y}$. With a gamble $h$ on $\mathcal{Y}$, there
corresponds a gamble $h \circ \varphi$ on $\mathcal{X}$, whose lower prevision is $\underline{P}(h \circ \varphi)$. This leads us to
define the distribution of $Y = \varphi(X)$ as the induced coherent lower prevision $\underline{Q}$ on $\mathcal{L}(\mathcal{Y})$,
defined by

\[ \underline{Q}(h) := \underline{P}(h \circ \varphi), \quad h \in \mathcal{L}(\mathcal{Y}). \]

For an event $A \subseteq \mathcal{Y}$, we see that $I_A \circ \varphi = I_{\varphi^{-1}(A)}$, where $\varphi^{-1}(A) := \{x \in \mathcal{X} \colon \varphi(x) \in A\}$,
and consequently $\underline{Q}(A) = \underline{P}(\varphi^{-1}(A))$. So we see that the notion of an induced lower
prevision generalises that of an induced probability measure.
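The induced lower prevision can be sketched as follows. The space, the parity map `phi` and the two mass functions in `CREDAL` are hypothetical choices for illustration; the lower prevision used is again a lower envelope, which is coherent by the lower envelope theorem.

```python
from fractions import Fraction as F

X = [1, 2, 3, 4]            # hypothetical values of the random variable X
phi = lambda x: x % 2       # hypothetical map from X to Y = {0, 1} (parity)

# Hypothetical set of mass functions on X whose lower envelope models the beliefs.
CREDAL = [
    {1: F(1, 4), 2: F(1, 4), 3: F(1, 4), 4: F(1, 4)},
    {1: F(1, 8), 2: F(1, 2), 3: F(1, 8), 4: F(1, 4)},
]

def lower_P(f):
    """Distribution of X: lower envelope over CREDAL, for a gamble f on X."""
    return min(sum(p[x] * f(x) for x in X) for p in CREDAL)

def lower_Q(h):
    """Induced distribution of Y = phi(X): Q(h) := P(h o phi)."""
    return lower_P(lambda x: h(phi(x)))

# Event A = {1} in Y ("X is odd") and its preimage under phi.
I_A = lambda y: 1 if y == 1 else 0
preimage = {x for x in X if phi(x) == 1}
I_preimage = lambda x: 1 if x in preimage else 0

# Q(A) coincides with P(phi^{-1}(A)).
assert lower_Q(I_A) == lower_P(I_preimage)
print(sorted(preimage), lower_Q(I_A))  # [1, 3] 1/4
```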

Finally, consider a sequence of random variables $X_n$, all taking values in some metric
space $S$. Denote by $\mathcal{C}(S)$ the set of all continuous gambles on $S$. For each random variable
$X_n$, we have a distribution in the form of a coherent lower prevision $\underline{P}_{X_n}$ on $\mathcal{L}(S)$. Then we
say that the random variables converge in distribution if for all $h \in \mathcal{C}(S)$, the sequence of
real numbers $\underline{P}_{X_n}(h)$ converges to some real number, which we denote by $\underline{P}(h)$. The limit
lower prevision $\underline{P}$ on $\mathcal{C}(S)$ that we can define in this way, is coherent, because a point-wise
limit of coherent lower previsions always is.

3. Exchangeable random variables

We are now ready to recall Walley's (1991, Section 9.5) notion of exchangeability in
the context of the theory of coherent lower previsions. We shall see that it generalises
de Finetti's definition for linear previsions (de Finetti, 1937, 1975).



3.1. Definition and basic properties. Consider $N \geq 1$ random variables $X_1, \dots, X_N$
taking values in a non-empty and finite set $\mathcal{X}$.^3 A subject's beliefs about the values that these
random variables $X = (X_1, \dots, X_N)$ assume jointly in $\mathcal{X}^N$ are given by their (joint)
distribution, which is a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on
$\mathcal{X}^N$.

Let us denote by $\mathcal{P}_N$ the set of all permutations of $\{1, \dots, N\}$. With any such
permutation $\pi$ we can associate, by the procedure of lifting, a permutation of $\mathcal{X}^N$, also denoted
by $\pi$, that maps any $x = (x_1, \dots, x_N)$ in $\mathcal{X}^N$ to $\pi x := (x_{\pi(1)}, \dots, x_{\pi(N)})$. Similarly, with any
gamble $f$ on $\mathcal{X}^N$, we can consider the permuted gamble $\pi f := f \circ \pi$, or in other words,
$(\pi f)(x) = f(\pi x)$ for all $x \in \mathcal{X}^N$.

A subject judges the random variables $X_1, \dots, X_N$ to be exchangeable when he is
disposed to exchange any gamble $f$ for the permuted gamble $\pi f$, meaning that
$\underline{P}^N_{\mathcal{X}}(\pi f - f) \geq 0$^4 for any permutation $\pi$. Taking into account the properties of coherence,
this means that

\[ \underline{P}^N_{\mathcal{X}}(\pi f - f) = \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0 \]

for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$. In this case, we shall also call
the joint coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ exchangeable. A subject will make an assumption
of exchangeability when there is evidence that the processes generating the values of the
random variables are (physically) similar (Walley, 1991, Section 9.5.2), and consequently
the order in which the variables are observed is not important.

When $\underline{P}^N_{\mathcal{X}}$ is in particular a linear prevision $P^N_{\mathcal{X}}$, exchangeability is equivalent to
having $P^N_{\mathcal{X}}(\pi f) = P^N_{\mathcal{X}}(f)$ for all gambles $f$ and all permutations $\pi$. Another equivalent
formulation can be given in terms of the (probability) mass function $p^N_{\mathcal{X}}$ of $P^N_{\mathcal{X}}$, defined by
$p^N_{\mathcal{X}}(x) := P^N_{\mathcal{X}}(\{x\})$. Indeed, if we apply linearity to find that
$P^N_{\mathcal{X}}(f) = \sum_{x \in \mathcal{X}^N} f(x) p^N_{\mathcal{X}}(x)$,
we see that the exchangeability condition for linear previsions is equivalent to having
$p^N_{\mathcal{X}}(x) = p^N_{\mathcal{X}}(\pi x)$ for all $x$ in $\mathcal{X}^N$, or in other words, the mass function $p^N_{\mathcal{X}}$ should be
invariant under permutation of the indices. This is essentially de Finetti's (1937) definition
for the exchangeability of a prevision. The following proposition, mentioned by Walley
(1991, Section 9.5), and whose proof is immediate and therefore omitted, establishes an
even stronger link between Walley's and de Finetti's notions of exchangeability.

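Proposition 1. Any coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that dominates an exchangeable
coherent lower prevision, is also exchangeable. Moreover, let $\underline{P}^N_{\mathcal{X}}$ be the lower envelope
of some set of linear previsions $\mathcal{M}^N_{\mathcal{X}}$ in the sense that

\[ \underline{P}^N_{\mathcal{X}}(f) = \min \{ P^N_{\mathcal{X}}(f) \colon P^N_{\mathcal{X}} \in \mathcal{M}^N_{\mathcal{X}} \} \]

for all gambles $f$ on $\mathcal{X}^N$. Then $\underline{P}^N_{\mathcal{X}}$ is exchangeable if and only if all the linear
previsions $P^N_{\mathcal{X}}$ in $\mathcal{M}^N_{\mathcal{X}}$ are exchangeable.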
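The "if" direction of Proposition 1 can be checked numerically in a small case. The sketch below, for three binary variables, builds two hypothetical exchangeable mass functions (mass depends only on the number of ones) and verifies that their lower envelope assigns lower prevision zero to every difference $\pi f - f$, for all indicator gambles $f$. The names `mass_from_counts` and `CREDAL` are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import permutations, product
from math import comb

N = 3
SPACE = list(product((0, 1), repeat=N))  # all N-tuples of zeros and ones

def mass_from_counts(weights):
    """Exchangeable mass function: weights[k] is the total probability of
    observing k ones, spread uniformly over the sequences with k ones."""
    return {x: F(weights[sum(x)], comb(N, sum(x))) for x in SPACE}

# Two hypothetical exchangeable linear previsions, given by their mass functions.
CREDAL = [
    mass_from_counts([F(1, 4)] * 4),
    mass_from_counts([F(1, 2), 0, 0, F(1, 2)]),
]

def lower(f):
    """Lower envelope of the linear previsions in CREDAL."""
    return min(sum(p[x] * f[x] for x in SPACE) for p in CREDAL)

def permuted(pi, f):
    """The permuted gamble (pi f)(x) = f(pi x), with (pi x)_k = x_{pi(k)}."""
    return {x: f[tuple(x[pi[k]] for k in range(N))] for x in SPACE}

# Check the exchangeability condition on all 2^8 indicator gambles.
for values in product((0, 1), repeat=len(SPACE)):
    f = dict(zip(SPACE, values))
    for pi in permutations(range(N)):
        pf = permuted(pi, f)
        assert lower({x: pf[x] - f[x] for x in SPACE}) == 0
```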



^3 We could easily define exchangeability for variables that assume values in a set $\mathcal{X}$ that is
not necessarily finite. But since we only prove interesting results for finite $\mathcal{X}$, we have decided
to use a finitary context from the outset.

^4 This means that the subject is willing to accept the gamble $\pi f - f$, i.e., to exchange $f$ for
$\pi f$, in return for any positive amount of utility $\varepsilon$, however small.

If a coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is exchangeable, it is immediately guaranteed to be
also permutable^5 in the sense that

\[ \underline{P}^N_{\mathcal{X}}(\pi f) = \underline{P}^N_{\mathcal{X}}(f) \quad \text{for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$.} \]

The converse does not hold in general. For linear previsions $P^N_{\mathcal{X}}$, permutability is
equivalent to exchangeability, but this equivalence is generally broken for coherent lower
previsions that are not linear.^6
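A small counterexample, constructed here for illustration and not taken from the sources cited, shows how permutability can hold without exchangeability. For two binary variables, take the lower envelope of the two degenerate linear previsions concentrated on (0, 1) and on (1, 0). Neither of them is exchangeable, so by Proposition 1 the envelope is not exchangeable, yet it is permutable.

```python
from itertools import permutations, product

N = 2
SPACE = list(product((0, 1), repeat=N))

def lower(f):
    """Lower envelope of the point masses on (0, 1) and on (1, 0)."""
    return min(f[(0, 1)], f[(1, 0)])

def permuted(pi, f):
    """The permuted gamble (pi f)(x) = f(pi x), with (pi x)_k = x_{pi(k)}."""
    return {x: f[tuple(x[pi[k]] for k in range(N))] for x in SPACE}

# Permutability: P(pi f) = P(f) on a grid of gambles with values in {0, 1, 2}.
for values in product((0, 1, 2), repeat=len(SPACE)):
    g = dict(zip(SPACE, values))
    for pi in permutations(range(N)):
        assert lower(permuted(pi, g)) == lower(g)

# Exchangeability fails: take f the indicator of (0, 1) and pi the swap.
f = {x: int(x == (0, 1)) for x in SPACE}
pf = permuted((1, 0), f)
diff = {x: pf[x] - f[x] for x in SPACE}
print(lower(pf), lower(f), lower(diff))  # 0 0 -1
```

The subject modelled here is not disposed to exchange $f$ for $\pi f$, since the lower prevision of $\pi f - f$ is strictly negative.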

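Clearly, if $X_1, \dots, X_N$ are exchangeable, then any permutation $X_{\pi(1)}, \dots, X_{\pi(N)}$ is
exchangeable as well, and has the same distribution $\underline{P}^N_{\mathcal{X}}$. Moreover, any selection of
$1 \leq n \leq N$ random variables from amongst the $X_1, \dots, X_N$ are exchangeable too, and their
distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$, given by
$\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^N_{\mathcal{X}}(\tilde{f})$
for all gambles $f$ on $\mathcal{X}^n$, where the gamble $\tilde{f}$ on $\mathcal{X}^N$ is the cylindrical extension of $f$
to $\mathcal{X}^N$, given by $\tilde{f}(z_1, \dots, z_N) := f(z_1, \dots, z_n)$ for all $(z_1, \dots, z_N)$ in $\mathcal{X}^N$.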

Running example. This is the place to introduce our running example. As we go along,
we shall try to clarify our reasoning by looking at a specific special case, that is as simple
as possible, namely where the random variables $X_k$ we consider can assume only two
values. So we might be looking at tossing coins, or thumbtacks, and consider modelling
the exchangeability assessment that the order in which these coin flips are considered is of
no consequence. More generally, our random variables might be the indicators of events:
$X_k = I_{E_k}$, and then we consider the events $E_1, \dots, E_N$ to be exchangeable when the order
in which they are observed is of no consequence.

Formally, we denote the set of possible values for such variables by $\mathbb{B} = \{0, 1\}$, where 1
and 0 could stand for heads and tails, success and failure, the occurrence or not of an event,
and so on. In what follows, we shall often call 1 a success, and 0 a failure.

The joint random variable $X = (X_1, \dots, X_N)$ then assumes values in the space $\mathbb{B}^N$, which
is made up of all $N$-tuples of zeros and ones. As an example, in the case $N = 3$, two
possible elements of $\mathbb{B}^3$ are $(1, 0, 1)$ and $(0, 1, 1)$. These elements can be related to each
other by a permutation of the indices, i.e., of the order in which they occur, and therefore
any exchangeable linear prevision should assign the same probability mass to them. And
any exchangeable coherent lower prevision is a lower envelope of such exchangeable linear
previsions.

3.2. Count vectors. Interestingly, exchangeable coherent lower previsions have a very 
simple representation, in terms of sampling without replacementQ To see how this comes 
about, consider any x £ Then the so-called (permutation) invariant atom 

[x] {ttx: n e 3^n} 

is the smallest non-empty subset of that contains x and that is invariant under all 
permutations n in ^at. We shall denote the set of permutation invariant atoms of 



We use the terminology in IWallevI fT99ll Section 9.4). 
^This is an instance of a more general plienomenon: we can generally consider two types of invariance of 
a belief model (a coherent lower prevision) with respect to a semigroup of transformations: weak and strong 
invariance. The former, of which permutability is a special case, tells us that the model or the beliefs are symmet- 
rical (symmetry of evidence), whereas the latter, of which exchangeability is a special case, reflects that a subject 
believes there is symmetry (evidence of symmetry). Strong invariance generall y implies weak invariance but th e 
two notions in general only coincide for linear previsions. For more details, see lDe Cooman and Mirand3 ilOOTi) . 

"7 

Actually this is a special case of a much more general representation result for coherent lower previsions 
on a fini te space that are strongly invariant with respect to a finite group of permutations of that space; see 
jPe Cooman a nd Miranda. 2007) for more details. Here we give a different proof. 
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by It constitutes a partition of the set . We can characterise these invariant atoms 
using the counting maps : — > Nq defined for all x in ^ in such a way that 



is the number of components of the A^-tuple z that assume the value x. Here \A\ denotes the 
number of elements in a finite set A, and No is the set of aU non-negative integers (including 
zero). We shall denote by the vector- valued map from to Nq^ whose component 
maps are the , x & ^ . Observe that actually assumes values in the set of count 
vectors 



Since permuting the components of a vector leaves the counts invariant, meaning that 
(z) = (ttz) for all z G JT^ and tt G S^n, we see that for aU y and z in JT'V 



The counting map can therefore be interpreted as a bijection (one-to-one and onto) between the set of invariant atoms and the set of count vectors $\mathcal{N}^N_{\mathcal{X}}$, and we can identify any invariant atom $[z]$ by the count vector $m = T^N_{\mathcal{X}}(z)$ of any (and therefore all) of its elements. Recall that the components of the counting map are given by

$$T_x(z) = T_x(z_1,\dots,z_N) := |\{k \in \{1,\dots,N\} : z_k = x\}|,$$

and that two sequences $y$ and $z$ belong to the same invariant atom precisely when $T^N_{\mathcal{X}}(y) = T^N_{\mathcal{X}}(z)$. We shall therefore also denote this atom by $[m]$; and clearly $y \in [m]$ if and only if $T^N_{\mathcal{X}}(y) = m$. The number of elements $\nu(m)$ in any invariant atom $[m]$ is given by the number of different ways in which the components of any $z$ in $[m]$ can be permuted, and is therefore given by

$$\nu(m) := \binom{N}{m} = \frac{N!}{\prod_{x\in\mathcal{X}} m_x!}.$$

If the joint random variable $X = (X_1,\dots,X_N)$ assumes the value $z$ in $\mathcal{X}^N$, then the corresponding count vector assumes the value $T^N_{\mathcal{X}}(z)$ in $\mathcal{N}^N_{\mathcal{X}}$. This means that we can see $T^N_{\mathcal{X}}(X) = T^N_{\mathcal{X}}(X_1,\dots,X_N)$ as a random variable in $\mathcal{N}^N_{\mathcal{X}}$. If the available information about the values that $X$ assumes in $\mathcal{X}^N$ is given by the coherent exchangeable lower prevision $\underline{P}^N_{\mathcal{X}}$ (the distribution of $X$), then the corresponding uncertainty model for the values that $T^N_{\mathcal{X}}(X)$ assumes in $\mathcal{N}^N_{\mathcal{X}}$ is given by the coherent induced lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ (the distribution of $T^N_{\mathcal{X}}(X)$), given by

$$\underline{Q}^N_{\mathcal{X}}(h) := \underline{P}^N_{\mathcal{X}}(h \circ T^N_{\mathcal{X}}) \tag{4}$$

for all gambles $h$ on $\mathcal{N}^N_{\mathcal{X}}$. We shall now prove a theorem that shows that, conversely, any exchangeable coherent lower prevision $\underline{P}^N_{\mathcal{X}}$ is in fact completely determined by the corresponding distribution $\underline{Q}^N_{\mathcal{X}}$ of the count vectors, also called its count distribution. It also establishes a relationship between exchangeability and sampling without replacement.

To get where we want, consider an urn with $N$ balls of different types, where the different types are characterised by the elements $x$ of the set $\mathcal{X}$. Suppose the composition of the urn is given by the count vector $m \in \mathcal{N}^N_{\mathcal{X}}$, meaning that $m_x$ balls are of type $x$, for $x \in \mathcal{X}$. We now successively select (in a random way) $N$ balls from the urn, without replacing them. Denote by $Y_k$ the random variable in $\mathcal{X}$ that is the type of the $k$-th ball selected. The possible outcomes of this experiment, i.e., the possible values of the joint random variable $Y = (Y_1,\dots,Y_N)$, are precisely the elements $z$ of the permutation invariant atom $[m]$, and random selection simply means that each of these outcomes is equally likely. Since there are $\nu(m)$ such possible outcomes, each of them has probability $1/\nu(m)$. Also, any $z$ not in $[m]$ has zero probability of being the outcome of our sampling procedure. This means that for any gamble $f$ on $\mathcal{X}^N$, its (precise) prevision (or expectation) is given by

$$\mathrm{MuHy}^N_{\mathcal{X}}(f|m) := \frac{1}{\nu(m)} \sum_{z\in[m]} f(z).$$
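The counting map, the atom size $\nu(m)$ and the prevision $\mathrm{MuHy}^N_{\mathcal{X}}(f|m)$ defined above can be sketched in a few lines of Python. This is only a brute-force illustration under stated assumptions: the three-element set `categories`, the urn composition `m` and the gamble `first_is_a` are hypothetical choices, and atoms are enumerated exhaustively, which is feasible only for small $N$.

```python
from collections import Counter
from itertools import product
from math import factorial

def count_vector(z, categories):
    """Counting map T: number of times each category occurs in the sequence z."""
    counts = Counter(z)
    return tuple(counts[x] for x in categories)

def nu(m):
    """Number of elements of the invariant atom [m]: N! / prod_x m_x!."""
    if any(mx < 0 for mx in m):
        return 0
    result = factorial(sum(m))
    for mx in m:
        result //= factorial(mx)
    return result

def atom(m, categories):
    """All sequences z whose count vector is m (the invariant atom [m])."""
    n = sum(m)
    return [z for z in product(categories, repeat=n)
            if count_vector(z, categories) == tuple(m)]

def muhy(f, m, categories):
    """MuHy(f|m): the average of the gamble f over the atom [m]."""
    members = atom(m, categories)
    return sum(f(z) for z in members) / len(members)

categories = ("a", "b", "c")   # hypothetical three-element set of types
m = (2, 1, 0)                  # urn with two 'a' balls and one 'b' ball
first_is_a = lambda z: 1.0 if z[0] == "a" else 0.0   # indicator gamble

atom_size = len(atom(m, categories))
prob_first_a = muhy(first_is_a, m, categories)
```

Here `prob_first_a` is the probability that the first ball drawn is of type `a`, which is the fraction of `a` balls in the urn.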

The linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|m)$ is the one associated with a multiple hyper-geometric distribution (Johnson et al., 1997, Chapter 39), whence the notation. Indeed, for any $x = (x_1,\dots,x_n)$ in $\mathcal{X}^n$, where $1 \le n \le N$, the probability of drawing a sequence of balls $x$ from an urn with composition $m$ is given by

$$\frac{\nu(m-\mu)}{\nu(m)} = \frac{1}{\nu(\mu)}\,\frac{\prod_{x\in\mathcal{X}}\binom{m_x}{\mu_x}}{\binom{N}{n}},$$

where $\mu = T^n_{\mathcal{X}}(x)$. This means that the probability of drawing without replacement any sample with count vector $\mu$ is $\nu(\mu)$ times this probability [there are that many such samples], and is therefore given by

$$\frac{\nu(m-\mu)\,\nu(\mu)}{\nu(m)} = \frac{\prod_{x\in\mathcal{X}}\binom{m_x}{\mu_x}}{\binom{N}{n}},$$

which indeed gives the mass function for the multiple hyper-geometric distribution. For any permutation $\pi$ of $\{1,\dots,N\}$,

$$\mathrm{MuHy}^N_{\mathcal{X}}(\pi f|m) = \frac{1}{\nu(m)}\sum_{z\in[m]} f(\pi z) = \frac{1}{\nu(m)}\sum_{\pi^{-1}z\in[m]} f(z) = \mathrm{MuHy}^N_{\mathcal{X}}(f|m), \tag{5}$$

since $\pi^{-1}z \in [m]$ iff $z \in [m]$. This means that the linear prevision $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|m)$ is exchangeable. The following theorem establishes an even stronger result.



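Theorem 2 (Representation theorem for finite sequences of exchangeable variables). Let $N \ge 1$ and let $\underline{P}^N_{\mathcal{X}}$ be a coherent exchangeable lower prevision on $\mathcal{L}(\mathcal{X}^N)$. Let $f$ be any gamble on $\mathcal{X}^N$. Then the following statements hold:

1. The gamble $\tilde f$ on $\mathcal{X}^N$ given by $\tilde f := \frac{1}{N!}\sum_{\pi\in\mathcal{P}_N} \pi f$ is permutation invariant, meaning that $\pi\tilde f = \tilde f$ for all $\pi \in \mathcal{P}_N$. It is therefore constant on the permutation invariant atoms of $\mathcal{X}^N$, and also given by

$$\tilde f = \sum_{m\in\mathcal{N}^N_{\mathcal{X}}} I_{[m]}\,\mathrm{MuHy}^N_{\mathcal{X}}(f|m). \tag{6}$$

2. $\underline{P}^N_{\mathcal{X}}(f - \tilde f) = \underline{P}^N_{\mathcal{X}}(\tilde f - f) = 0$, and therefore also $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\tilde f)$.

3. $\underline{P}^N_{\mathcal{X}}(f) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot))$, where $\mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot)$ is the gamble on $\mathcal{N}^N_{\mathcal{X}}$ that assumes the value $\mathrm{MuHy}^N_{\mathcal{X}}(f|m)$ in $m \in \mathcal{N}^N_{\mathcal{X}}$.

Consequently a lower prevision on $\mathcal{L}(\mathcal{X}^N)$ is exchangeable if and only if it has the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|\cdot))$, where $\underline{Q}$ is any coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.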

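Proof. The first statement is fairly immediate. We therefore turn at once to the second statement. Observe that $f - \tilde f = \frac{1}{N!}\sum_{\pi\in\mathcal{P}_N}[f - \pi f]$. Now use the coherence [super-additivity and non-negative homogeneity], and the exchangeability of the lower prevision $\underline{P}^N_{\mathcal{X}}$ to find that

$$\underline{P}^N_{\mathcal{X}}(f - \tilde f) \ge \frac{1}{N!}\sum_{\pi\in\mathcal{P}_N} \underline{P}^N_{\mathcal{X}}(f - \pi f) = 0.$$

In a completely similar way, we get $\underline{P}^N_{\mathcal{X}}(\tilde f - f) \ge 0$. Since it also follows from the coherence [super-additivity] of $\underline{P}^N_{\mathcal{X}}$ that $\underline{P}^N_{\mathcal{X}}(f - \tilde f) + \underline{P}^N_{\mathcal{X}}(\tilde f - f) \le \underline{P}^N_{\mathcal{X}}(0) = 0$, we find that indeed $\underline{P}^N_{\mathcal{X}}(f - \tilde f) = \underline{P}^N_{\mathcal{X}}(\tilde f - f) = 0$. Now let $g := f - \tilde f$, then $f = \tilde f + g$ and $\tilde f = f - g$, and use the coherence [super-additivity and accepting sure gains] of $\underline{P}^N_{\mathcal{X}}$ to infer that

$$\underline{P}^N_{\mathcal{X}}(f) \ge \underline{P}^N_{\mathcal{X}}(\tilde f) + \underline{P}^N_{\mathcal{X}}(g) = \underline{P}^N_{\mathcal{X}}(\tilde f) = \underline{P}^N_{\mathcal{X}}(f - g) \ge \underline{P}^N_{\mathcal{X}}(f) + \underline{P}^N_{\mathcal{X}}(-g) = \underline{P}^N_{\mathcal{X}}(f),$$

whence indeed $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\tilde f)$.

To prove the third statement, use $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\tilde f)$ together with Equations (4) and (6) to find that $\underline{P}^N_{\mathcal{X}}(f) = \underline{P}^N_{\mathcal{X}}(\tilde f) = \underline{Q}^N_{\mathcal{X}}(\mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot))$.

These statements imply that any exchangeable coherent lower prevision is of the form $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|\cdot))$, where $\underline{Q}$ is some coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$. Conversely, if $\underline{Q}$ is any coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, then $\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|\cdot))$ is a coherent lower prevision on $\mathcal{L}(\mathcal{X}^N)$ that is exchangeable: simply observe that for any gamble $f$ on $\mathcal{X}^N$ and any $\pi \in \mathcal{P}_N$,

$$\underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f - \pi f|\cdot)) = \underline{Q}(\mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot) - \mathrm{MuHy}^N_{\mathcal{X}}(\pi f|\cdot)) = \underline{Q}(0) = 0,$$

taking into account that each $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|m)$ is an exchangeable linear prevision [Equation (5)]. □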
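The statements of Theorem 2 can be checked numerically on a small case. The sketch below assumes two categories ($\mathcal{X} = \{0,1\}$, $N = 3$) and, purely for illustration, takes the count lower prevision to be the lower envelope of two hypothetical mass functions (`count_masses`); a lower envelope of linear previsions is coherent, but nothing in the text singles out this particular model.

```python
from itertools import permutations, product
from math import comb

N = 3
outcomes = list(product((0, 1), repeat=N))     # all sequences in {0, 1}^N

def hy(f, s):
    """MuHy(f|m) for m = (s, N - s): average of f over sequences with s successes."""
    members = [z for z in outcomes if sum(z) == s]
    return sum(f(z) for z in members) / comb(N, s)

def symmetrise(f):
    """The permutation-averaged gamble (1/N!) * sum over permutations pi of pi f."""
    perms = list(permutations(range(N)))
    return lambda z: sum(f(tuple(z[i] for i in p)) for p in perms) / len(perms)

# Hypothetical count model: lower envelope of two mass functions on s = 0, ..., 3.
count_masses = [(0.1, 0.2, 0.3, 0.4), (0.4, 0.3, 0.2, 0.1)]

def lower_q(g):
    """Lower prevision on the count space: minimum expectation over count_masses."""
    return min(sum(p[s] * g(s) for s in range(N + 1)) for p in count_masses)

def lower_p(f):
    """Lower prevision on sequences of the form Q(MuHy(f|.))."""
    return lower_q(lambda s: hy(f, s))

f = lambda z: 2.0 * z[0] - z[1] * z[2]          # an arbitrary gamble
swapped = lambda z: f((z[2], z[1], z[0]))       # f composed with a permutation
f_tilde = symmetrise(f)
```

The symmetrised gamble `f_tilde` takes the value `hy(f, s)` on every sequence with `s` successes (statement 1), and `lower_p` gives the same value to `f`, `swapped` and `f_tilde` (exchangeability and statement 2).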

This theorem implies that any exchangeable coherent lower prevision on $\mathcal{X}^N$ can be associated with, or equivalently, that any collection of $N$ exchangeable random variables in $\mathcal{X}$ can be seen as the result of, $N$ random draws without replacement from an urn with $N$ balls whose types are characterised by the elements $x$ of $\mathcal{X}$, whose composition $m$ is unknown, but for which the available information about the composition is modelled by a coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

That exchangeable linear previsions can be interpreted in terms of sampling without replacement from an urn with unknown composition is of course well known, and essentially goes back to de Finetti's work on exchangeability; see de Finetti (1937) and Cifarelli and Regazzini (1996). Heath and Sudderth (1976) give a simple proof for variables that may assume two values. But we believe our proof for the more general case of exchangeable coherent lower previsions and random variables that may assume more than two values is conceptually even simpler than Heath and Sudderth's proof, even though it is a special case of a much more general representation result (De Cooman and Miranda, 2007, Theorem 30). The essence of the present proof in the special case of linear previsions $P$ is captured wonderfully well by Zabell's (1992, Section 3.1) succinct statement: "Thus $P$ is exchangeable if and only if two sequences having the same frequency vector have the same probability."

Running example. We come back to the simple case considered before, where $\mathcal{X} = \mathbb{B}$. Any two elements $x$ and $y$ of $\mathbb{B}^N$ can be related by some permutation of the indices $\{1,\dots,N\}$ iff they have the same number of successes $s = T_1(x) = T_1(y)$ (and of course, the same number of failures $f = N - s$). We can identify the count space $\mathcal{N}^N_{\mathbb{B}} = \{(s,f) : s + f = N\}$ with the set $\{s : s = 0,\dots,N\}$, and count vectors $m = (s, N-s)$ with the corresponding number of successes $s$, which is what we shall do from now on.

The $2^N$ elements of $\mathbb{B}^N$ are divided into $N + 1$ invariant atoms $[s]$ of elements with the same number of successes $s$, each of which has $\nu(s) = \binom{N}{s} = \frac{N!}{s!(N-s)!}$ elements. We have depicted the situation for $N = 3$ in Figure 1.

Footnote (urn interpretation of Theorem 2). When $\underline{P}^N_{\mathcal{X}}$, and therefore also $\underline{Q}^N_{\mathcal{X}}$, is a linear prevision, i.e., a precise probability model, this interpretation follows from the Theorem of Total Probability, by interpreting the $\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|m)$ as conditional previsions, and $\underline{Q}^N_{\mathcal{X}}$ as a marginal. For imprecise models $\underline{P}^N_{\mathcal{X}}$ and $\underline{Q}^N_{\mathcal{X}}$, the validity of this interpretation follows by analogous reasoning, using Walley's Marginal Extension Theorem; see Walley (1991, Section 6.7) and Miranda and De Cooman (2006).

Footnote (our proof of Theorem 2). Walley (1991, Chapter 9) also mentions this result for exchangeable coherent lower previsions.



s = 0:  (0,0,0)
s = 1:  (1,0,0)  (0,1,0)  (0,0,1)
s = 2:  (1,1,0)  (1,0,1)  (0,1,1)
s = 3:  (1,1,1)

Figure 1. The four invariant atoms $[s]$ in the space $\mathbb{B}^3$, characterised by the number of successes $s$.



Exchangeability forces each of the elements within an invariant atom $[s]$ to be 'equally likely'. So each $[s]$ is to be considered as a 'lump', within which probability mass is distributed uniformly. The only freedom exchangeability leaves us with lies in assigning probabilities to the lumps $[s]$. This is the essence of Theorem 2, which tells us that any exchangeable coherent lower prevision $\underline{P}^N_{\mathbb{B}}$ on $\mathcal{L}(\mathbb{B}^N)$ can be seen as the composition of a coherent lower prevision $\underline{Q}^N_{\mathbb{B}}$ on $\mathcal{L}(\{0,1,\dots,N\})$, representing beliefs about the number of successes $s$, and the hyper-geometric distributions on $[s]$, which guarantee that the probability is distributed uniformly over each of the $\nu(s) = \binom{N}{s}$ elements of $[s]$: for any gamble $f$ on $\mathbb{B}^N$,

$$\mathrm{Hy}^N(f|s) := \mathrm{MuHy}^N_{\mathbb{B}}(f|s,N-s) = \frac{1}{\binom{N}{s}}\sum_{x\in[s]} f(x).$$

For an exchangeable random variable $X = (X_1,\dots,X_N)$, with (exchangeable) distribution $\underline{P}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^N)$, we have seen that we can completely characterise this distribution by the corresponding distribution of the count vectors $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$.

We have also seen that any selection of $1 \le n \le N$ random variables from amongst the $X_1,\dots,X_N$ will be exchangeable too, and that their distribution is given by $\underline{P}^n_{\mathcal{X}}$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N_{\mathcal{X}}$. There is moreover an interesting relation between the distributions $\underline{Q}^N_{\mathcal{X}}$ and $\underline{Q}^n_{\mathcal{X}}$ of the corresponding count vectors, which we shall derive in the next section (Equation (9)). On the other hand, it is well known (see for instance Diaconis and Freedman (1980); we shall come back to this in Section 7) that if we have an exchangeable $N$-tuple $(X_1,\dots,X_N)$, it is not always possible to extend it to an exchangeable $(N+1)$-tuple. In the next section, we investigate what happens when we consider exchangeable tuples of arbitrary length.



4. Exchangeable sequences 

4.1. Definitions. We now generalise the definition of exchangeability from finite to countable sequences of random variables. Consider a countable sequence $X_1, \dots, X_n, \dots$ of random variables taking values in the same non-empty set $\mathcal{X}$. This sequence is called exchangeable if any finite collection of random variables taken from this sequence is exchangeable. This is clearly equivalent to requiring that the random variables $X_1, \dots, X_n$ should be exchangeable for all $n \ge 1$.

We can also consider the exchangeable sequence as a single random variable $X$ assuming values in the set $\mathcal{X}^{\mathbb{N}}$, where $\mathbb{N}$ is the set of the natural numbers (positive integers, without zero). Its possible values $x$ are sequences $x_1, \dots, x_n, \dots$ of elements of $\mathcal{X}$, or in other words, maps from $\mathbb{N}$ to $\mathcal{X}$. We can model the available information about the value that $X$ assumes in $\mathcal{X}^{\mathbb{N}}$ by a coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$, called the distribution of the exchangeable random sequence $X$.

The random sequence $X$, or its distribution $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$, is clearly exchangeable if and only if all its $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ are exchangeable for $n \ge 1$. These marginals $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are defined as follows: for any gamble $f$ on $\mathcal{X}^n$, $\underline{P}^n_{\mathcal{X}}(f) := \underline{P}^{\mathbb{N}}_{\mathcal{X}}(\tilde f)$, where $\tilde f$ is the cylindrical extension of $f$ to $\mathcal{X}^{\mathbb{N}}$, defined by $\tilde f(x) := f(x_1,\dots,x_n)$ for all $x = (x_1,\dots,x_n,x_{n+1},\dots)$ in $\mathcal{X}^{\mathbb{N}}$. In addition, the family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, satisfies the following 'time consistency' requirement:

$$\underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde f), \tag{7}$$

for all $n \ge 1$, $k \ge 0$, and all gambles $f$ on $\mathcal{X}^n$, where now $\tilde f$ denotes the cylindrical extension of $f$ to $\mathcal{X}^{n+k}$: $\underline{P}^n_{\mathcal{X}}$ should be the $\mathcal{X}^n$-marginal of any $\underline{P}^{n+k}_{\mathcal{X}}$.

It follows at once that any finite collection of $n \ge 1$ random variables taken from such an exchangeable sequence has the same distribution as the first $n$ variables $X_1, \dots, X_n$, which is the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$.

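Conversely, suppose we have a collection of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, that satisfy the time consistency requirement (7). Then any coherent lower prevision $\underline{P}^{\mathbb{N}}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$ that has $\mathcal{X}^n$-marginals $\underline{P}^n_{\mathcal{X}}$ is exchangeable. The smallest, or most conservative such (exchangeable) coherent lower prevision is given by

$$\underline{E}^{\mathbb{N}}_{\mathcal{X}}(f) := \sup_{n\in\mathbb{N}} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)) = \lim_{n\to\infty} \underline{P}^n_{\mathcal{X}}(\underline{\mathrm{proj}}_n(f)),$$

where $f$ is any gamble on $\mathcal{X}^{\mathbb{N}}$, and its lower projection $\underline{\mathrm{proj}}_n(f)$ on $\mathcal{X}^n$ is the gamble on $\mathcal{X}^n$ that is defined by $\underline{\mathrm{proj}}_n(f)(x) := \inf_{z_k = x_k,\, k=1,\dots,n} f(z)$ for all $x \in \mathcal{X}^n$, i.e., the lower projection of $f$ on $x$ is the infimum of $f$ over the elements of $\mathcal{X}^{\mathbb{N}}$ whose projection on $\mathcal{X}^n$ is $x$. See (De Cooman and Miranda, 2006, Section 5) for more details.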



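4.2. Time consistency of the count distributions. It will be of crucial interest for what follows to find out what the consequences are of the time consistency requirement (7) on the marginals $\underline{P}^n_{\mathcal{X}}$ for the corresponding family $\underline{Q}^n_{\mathcal{X}}$, $n \ge 1$, of distributions of the count vectors $T^n_{\mathcal{X}}(X_1,\dots,X_n)$. Consider therefore $n \ge 1$, $k \ge 0$ and any gamble $h$ on $\mathcal{N}^n_{\mathcal{X}}$. Let $f := h \circ T^n_{\mathcal{X}}$, then

$$\underline{Q}^n_{\mathcal{X}}(h) = \underline{P}^n_{\mathcal{X}}(f) = \underline{P}^{n+k}_{\mathcal{X}}(\tilde f) = \underline{Q}^{n+k}_{\mathcal{X}}(\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde f|\cdot)),$$

where the first equality follows from Equation (4), the second from Equation (7), and the last from Theorem 2.

GERT DE COOMAN, ERIK QUAEGHEBEUR, AND ENRIQUE MIRANDA

Now for any $m'$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$ and any $z' = (z,y)$ in $\mathcal{X}^{n+k} = \mathcal{X}^n \times \mathcal{X}^k$ we have that $T^{n+k}_{\mathcal{X}}(z') = T^n_{\mathcal{X}}(z) + T^k_{\mathcal{X}}(y)$ and therefore

$$\begin{aligned}
\mathrm{MuHy}^{n+k}_{\mathcal{X}}(\tilde f|m')
&= \frac{1}{\nu(m')}\sum_{z'\in[m']}\tilde f(z')
 = \frac{1}{\nu(m')}\sum_{(z,y)\in[m']} f(z)
 = \frac{1}{\nu(m')}\sum_{m\le m'}\ \sum_{z\in[m]}\ \sum_{y\in[m'-m]} f(z) \\
&= \frac{1}{\nu(m')}\sum_{m\le m'}\nu(m'-m)\,\nu(m)\,\mathrm{MuHy}^n_{\mathcal{X}}(f|m)
 = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}}\frac{\nu(m'-m)\,\nu(m)}{\nu(m')}\,h(m),
\end{aligned} \tag{8}$$

where the sums over $m \le m'$ range over the count vectors $m$ in $\mathcal{N}^n_{\mathcal{X}}$, since $\mathrm{MuHy}^n_{\mathcal{X}}(f|m) = h(m)$, and $\nu(m'-m)$ is zero unless $m \le m'$. So we see that time consistency is equivalent to

$$\underline{Q}^n_{\mathcal{X}}(h) = \underline{Q}^{n+k}_{\mathcal{X}}\Big(\sum_{m\in\mathcal{N}^n_{\mathcal{X}}}\frac{\nu(\cdot-m)\,\nu(m)}{\nu(\cdot)}\,h(m)\Big) \tag{9}$$

for all $n \ge 1$, $k \ge 0$ and $h \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.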
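As a sanity check on the time consistency condition (9), the sketch below verifies it for a precise model with two categories, where the count distribution of $n$ IID variables with success probability `theta` is binomial. The helper names (`kernel`, `lifted`, `binomial_prevision`) and the numbers `n`, `k`, `theta` are hypothetical; the binary restriction is an assumption made to keep the example short.

```python
from math import comb

def kernel(s_big, s_small, n, k):
    """nu(m' - m) nu(m) / nu(m') for binary counts: a hypergeometric mass.

    Here m' = (s_big, n + k - s_big) and m = (s_small, n - s_small);
    comb returns 0 when the lower index exceeds the upper one.
    """
    extra = s_big - s_small
    if extra < 0:
        return 0.0
    return comb(k, extra) * comb(n, s_small) / comb(n + k, s_big)

def binomial_prevision(g, n, theta):
    """Expectation of g(S) when S is binomial with parameters n and theta."""
    return sum(g(s) * comb(n, s) * theta**s * (1 - theta)**(n - s)
               for s in range(n + 1))

def lifted(h, n, k):
    """The gamble on counts of n + k variables that appears on the right of (9)."""
    return lambda s_big: sum(kernel(s_big, s, n, k) * h(s) for s in range(n + 1))

n, k, theta = 3, 4, 0.3
h = lambda s: (s - 1) ** 2          # an arbitrary gamble on {0, ..., n}
lhs = binomial_prevision(h, n, theta)
rhs = binomial_prevision(lifted(h, n, k), n + k, theta)
```

The two sides `lhs` and `rhs` agree up to rounding, and for each fixed count of the larger sample the kernel weights sum to one, as a hypergeometric mass function should.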



5. A representation theorem for exchangeable sequences



De Finetti (1937, 1975) has proven a representation result for exchangeable sequences with linear previsions that generalises Theorem 2, and where multinomial distributions take over the role that the multiple hyper-geometric ones play for finite collections of exchangeable variables. One simple and intuitive way (see also de Finetti, 1975, p. 218) to understand why the representation result can be thus extended from finite collections to countable sequences is based on the fact that the multinomial distribution can be seen as a limit of multiple hyper-geometric ones (Johnson et al., 1997, Chapter 39). This is also the central idea behind Heath and Sudderth's (1976) simple proof of this representation result in the case of variables that may only assume two possible values.

However, there is another, arguably even simpler, approach to proving the same results, 
which we present here. It also works for exchangeability in the context of coherent lower 
previsions. And as we shall have occasion to explain further on, it has the additional ad- 
vantage of clearly indicating what the 'representation' is, and where it is uniquely defined. 

We make a start at proving our representation theorem by taking a look at multinomial 
processes. 



5.1. Multinomial processes are exchangeable. Consider a sequence of random variables $Y_1, \dots, Y_n, \dots$ that are mutually independent, and such that each random variable $Y_n$ has the same probability mass function $\theta$: the probability that $Y_n = x$ is $\theta_x$ for $x \in \mathcal{X}$ (in other words, the random variables are IID). Observe that $\theta$ is an element of the $\mathcal{X}$-simplex

$$\Sigma_{\mathcal{X}} = \Big\{\theta \in \mathbb{R}^{\mathcal{X}} : (\forall x \in \mathcal{X})(\theta_x \ge 0) \text{ and } \sum_{x\in\mathcal{X}}\theta_x = 1\Big\}.$$

Then for any $n \ge 1$ and any $z$ in $\mathcal{X}^n$ the probability that $(Y_1,\dots,Y_n)$ is equal to $z$ is given by $\prod_{x\in\mathcal{X}}\theta_x^{T_x(z)}$, which yields the multinomial mass function (Johnson et al., 1997, Chapter 35). As a result, we have for any gamble $f$ on $\mathcal{X}^n$ that its corresponding (multinomial) prevision (expectation) is given by

$$\begin{aligned}
\mathrm{Mn}^n_{\mathcal{X}}(f|\theta)
&= \sum_{z\in\mathcal{X}^n} f(z)\prod_{x\in\mathcal{X}}\theta_x^{T_x(z)}
 = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}}\ \sum_{z\in[m]} f(z)\prod_{x\in\mathcal{X}}\theta_x^{m_x} \\
&= \sum_{m\in\mathcal{N}^n_{\mathcal{X}}}\mathrm{MuHy}^n_{\mathcal{X}}(f|m)\,\nu(m)\prod_{x\in\mathcal{X}}\theta_x^{m_x} \\
&= \mathrm{CoMn}^n_{\mathcal{X}}(\mathrm{MuHy}^n_{\mathcal{X}}(f|\cdot)|\theta),
\end{aligned} \tag{10}$$

where we defined the (count multinomial) linear prevision $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot|\theta)$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ by

$$\mathrm{CoMn}^n_{\mathcal{X}}(g|\theta) = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} g(m)\,\nu(m)\prod_{x\in\mathcal{X}}\theta_x^{m_x}, \tag{11}$$

where $g$ is any gamble on $\mathcal{N}^n_{\mathcal{X}}$. The corresponding probability mass for any count vector $m$, namely

$$\mathrm{CoMn}^n_{\mathcal{X}}(\{m\}|\theta) = \nu(m)\prod_{x\in\mathcal{X}}\theta_x^{m_x} =: B_m(\theta), \tag{12}$$

is the probability of observing some value $z$ for $(Y_1,\dots,Y_n)$ whose count vector is $m$. The polynomial function $B_m$ on the $\mathcal{X}$-simplex is called a (multivariate) Bernstein (basis) polynomial. We have listed a number of very interesting properties for these special polynomials in the Appendix. One important fact, which we shall need quite soon, is that the set $\{B_m : m \in \mathcal{N}^n_{\mathcal{X}}\}$ of all Bernstein (basis) polynomials of fixed degree $n$ forms a basis for the linear space of all (multivariate) polynomials on $\Sigma_{\mathcal{X}}$ whose degree is at most $n$; hence their name. If we have a polynomial $p$ of degree $m$, this means that for any $n \ge m$, $p$ has a unique (Bernstein) decomposition $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that

$$p = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} b^n_p(m)\,B_m.$$

If we combine this with Equations (11) and (12), we find that $b^n_p$ is the unique gamble on $\mathcal{N}^n_{\mathcal{X}}$ such that $\mathrm{CoMn}^n_{\mathcal{X}}(b^n_p|\cdot) = p$.

We deduce from Equation (10) and Theorem 2 that the linear prevision $\mathrm{Mn}^n_{\mathcal{X}}(\cdot|\theta)$ on $\mathcal{L}(\mathcal{X}^n)$ (the distribution of $(Y_1,\dots,Y_n)$) is exchangeable, and that $\mathrm{CoMn}^n_{\mathcal{X}}(\cdot|\theta)$ is the corresponding distribution for the corresponding count vectors $T^n_{\mathcal{X}}(Y_1,\dots,Y_n)$. Therefore the sequence of IID random variables $Y_1, \dots, Y_n, \dots$ is exchangeable.
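Equation (10) and the fact that the Bernstein basis polynomials of a fixed degree form a probability mass function can be checked by brute force. The following is a minimal sketch, assuming a hypothetical three-element set `categories`, an arbitrary point `theta` of the simplex and an arbitrary gamble `f`; none of these come from the text.

```python
from itertools import product
from math import factorial, prod

categories = (0, 1, 2)                 # hypothetical three-element set

def count_vector(z):
    """Counting map: occurrences of each category in the sequence z."""
    return tuple(sum(1 for zk in z if zk == x) for x in categories)

def nu(m):
    """Multinomial coefficient: size of the invariant atom [m]."""
    result = factorial(sum(m))
    for mx in m:
        result //= factorial(mx)
    return result

def count_vectors(n):
    """All count vectors with nonnegative integer entries summing to n."""
    return sorted({count_vector(z) for z in product(categories, repeat=n)})

def bernstein(m, theta):
    """Multivariate Bernstein basis polynomial B_m(theta)."""
    return nu(m) * prod(t ** mx for t, mx in zip(theta, m))

def mn(f, n, theta):
    """Multinomial prevision: expectation of f under IID sampling from theta."""
    return sum(f(z) * prod(theta[x] for x in z)
               for z in product(categories, repeat=n))

def muhy(f, m):
    """Average of f over the invariant atom [m]."""
    members = [z for z in product(categories, repeat=sum(m))
               if count_vector(z) == m]
    return sum(f(z) for z in members) / len(members)

def comn(g, n, theta):
    """Count multinomial prevision: sum over m of g(m) * B_m(theta)."""
    return sum(g(m) * bernstein(m, theta) for m in count_vectors(n))

theta = (0.2, 0.5, 0.3)
n = 3
f = lambda z: z[0] + 2.0 * z[1] * z[2]
total_mass = sum(bernstein(m, theta) for m in count_vectors(n))
direct = mn(f, n, theta)
via_counts = comn(lambda m: muhy(f, m), n, theta)
```

Here `direct` evaluates the left-hand side of (10) by summing over all sequences, while `via_counts` evaluates the right-hand side by first averaging over atoms and then weighting with the Bernstein basis polynomials.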

Running example. Let us go back to our example, where $\mathcal{X} = \mathbb{B}$. Here the $\mathbb{B}$-simplex $\Sigma_{\mathbb{B}} = \{(\theta, 1-\theta) : \theta \in [0,1]\}$ can be identified with the unit interval, and every element $(\theta, 1-\theta)$ can be identified with the probability $\theta$ of a success.

The count multinomial distribution $\mathrm{CoMn}^n_{\mathbb{B}}(\cdot|\theta)$ now of course turns into the (count) binomial distribution $\mathrm{CoBi}^n(\cdot|\theta)$ on $\mathcal{L}(\{0,\dots,n\})$, given by

$$\mathrm{CoBi}^n(g|\theta) := \sum_{s=0}^{n} g(s)\binom{n}{s}\theta^s(1-\theta)^{n-s} = \sum_{s=0}^{n} g(s)\,B^n_s(\theta) \tag{13}$$

for any gamble $g$ on the set $\{0,1,\dots,n\}$ of possible values for the number of successes $s$. In this expression, the $B^n_s(\theta) := \binom{n}{s}\theta^s(1-\theta)^{n-s}$ are the $n + 1$ (univariate) Bernstein basis polynomials of degree $n$ (Lorentz, 1986; Prautzsch et al., 2002). For fixed $n$, they add up to one and are linearly independent, and they form a basis for the linear space of all polynomials on $[0,1]$ of degree at most $n$.



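Footnote (to Equation (12)). We assume implicitly that $a^0 = 1$ for all $a \ge 0$.

5.2. A representation theorem. Consider the following linear subspace of $\mathcal{L}(\Sigma_{\mathcal{X}})$:

$$\mathcal{V}(\Sigma_{\mathcal{X}}) := \{\mathrm{CoMn}^n_{\mathcal{X}}(g|\cdot) : n \ge 1,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})\} = \{\mathrm{Mn}^n_{\mathcal{X}}(f|\cdot) : n \ge 1,\ f \in \mathcal{L}(\mathcal{X}^n)\},$$

each of whose elements is a polynomial function on the $\mathcal{X}$-simplex:

$$\mathrm{CoMn}^n_{\mathcal{X}}(g|\theta) = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} g(m)\,\nu(m)\prod_{x\in\mathcal{X}}\theta_x^{m_x} = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} g(m)\,B_m(\theta),$$

and is actually a linear combination of Bernstein basis polynomials $B_m$ with coefficients $g(m)$. So $\mathcal{V}(\Sigma_{\mathcal{X}})$ is the linear space spanned by all Bernstein basis polynomials, and is therefore the set of all polynomials on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$.

Now if $\underline{R}_{\mathcal{X}}$ is any coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$, then it is easy to see that the family of coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, defined by

$$\underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f|\cdot)), \tag{14}$$

is still exchangeable and time consistent, and the corresponding count distributions are given by

$$\underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g|\cdot)), \quad g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}). \tag{15}$$

Here, we are going to show that a converse result also holds: for any time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$, $n \ge 1$, there is a coherent lower prevision $\underline{R}_{\mathcal{X}}$ on $\mathcal{V}(\Sigma_{\mathcal{X}})$ such that Equation (14), or its reformulation for counts (15), holds. We shall call such an $\underline{R}_{\mathcal{X}}$ a representation, or representing coherent lower prevision, for the family $\underline{P}^n_{\mathcal{X}}$. Of course, any representing $\underline{R}_{\mathcal{X}}$, if it exists, is uniquely determined on $\mathcal{V}(\Sigma_{\mathcal{X}})$.

So consider a family of coherent lower previsions $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$, $n \ge 1$, that are time consistent, meaning that Equation (9) is satisfied. It suffices to find an $\underline{R}_{\mathcal{X}}$ such that (15) holds, because the corresponding exchangeable lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$ are then uniquely determined by Theorem 2, and automatically satisfy the condition (14).

Our proposal is to define the functional $\underline{R}_{\mathcal{X}}$ on the set $\mathcal{V}(\Sigma_{\mathcal{X}})$ as follows: consider any element $p$ of $\mathcal{V}(\Sigma_{\mathcal{X}})$. Then, by definition, there is some $n \ge 1$ and a corresponding unique $b^n_p \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ such that $p = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p|\cdot)$. We then let $\underline{R}_{\mathcal{X}}(p) := \underline{Q}^n_{\mathcal{X}}(b^n_p)$.

Of course, the first thing to check is whether this definition is consistent: any polynomial $p$ of degree $m$ has unique representations $b^n_p$ for all $n \ge m$, which means that we have to check that no inconsistencies can arise in the sense that $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) \ne \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$ for some $n_1, n_2 \ge m$. It turns out that this is guaranteed by the time consistency of the $\underline{P}^n_{\mathcal{X}}$, or that of the corresponding $\underline{Q}^n_{\mathcal{X}}$, as is made apparent by the proof of the following lemma.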

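Lemma 3. Consider a polynomial $p$ of degree $m$, and let $n_1, n_2 \ge m$. Then $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) = \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$.

Proof. We may assume without loss of generality that $n_2 \ge n_1$. The Bernstein decompositions $b^{n_1}_p$ and $b^{n_2}_p$ are then related by Zhou's formula [see Equation (22) in the Appendix]:

$$b^{n_2}_p(m_2) = \sum_{m_1\in\mathcal{N}^{n_1}_{\mathcal{X}}}\frac{\nu(m_2-m_1)\,\nu(m_1)}{\nu(m_2)}\,b^{n_1}_p(m_1), \quad m_2 \in \mathcal{N}^{n_2}_{\mathcal{X}}.$$

Consequently, by the time consistency requirement (9), we indeed get that $\underline{Q}^{n_1}_{\mathcal{X}}(b^{n_1}_p) = \underline{Q}^{n_2}_{\mathcal{X}}(b^{n_2}_p)$. □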
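The degree-raising relation used in this proof can be illustrated in the univariate case (two categories), where the weights $\nu(m_2-m_1)\nu(m_1)/\nu(m_2)$ reduce to ratios of binomial coefficients. The sketch below is an illustration under that assumption; the coefficient list `b2` is a hypothetical decomposition and `elevate` is a name introduced here, not one used in the text.

```python
from math import comb

def bernstein(n, s, theta):
    """Univariate Bernstein basis polynomial of degree n and index s."""
    return comb(n, s) * theta**s * (1 - theta)**(n - s)

def evaluate(coeffs, theta):
    """Polynomial with Bernstein coefficients coeffs (degree n = len(coeffs) - 1)."""
    n = len(coeffs) - 1
    return sum(c * bernstein(n, s, theta) for s, c in enumerate(coeffs))

def elevate(coeffs, n2):
    """Bernstein coefficients of the same polynomial at the higher degree n2."""
    n1 = len(coeffs) - 1
    raised = []
    for s2 in range(n2 + 1):
        total = 0.0
        # Only indices s1 with 0 <= s2 - s1 <= n2 - n1 carry nonzero weight.
        for s1 in range(max(0, s2 - (n2 - n1)), min(n1, s2) + 1):
            weight = comb(n2 - n1, s2 - s1) * comb(n1, s1) / comb(n2, s2)
            total += weight * coeffs[s1]
        raised.append(total)
    return raised

b2 = [1.0, -0.5, 2.0]        # hypothetical decomposition at degree n1 = 2
b5 = elevate(b2, 5)          # decomposition of the same polynomial at degree 5
grid = [i / 10 for i in range(11)]
max_gap = max(abs(evaluate(b2, t) - evaluate(b5, t)) for t in grid)
```

Both coefficient lists describe the same polynomial, so `max_gap` is zero up to rounding. Because each raised coefficient is a convex combination of the original ones, the minimum of the coefficients cannot decrease when the degree is raised.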

We also have to check whether the functional $\underline{R}_{\mathcal{X}}$ thus defined on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ is a coherent lower prevision. This is established in the following lemma.

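Lemma 4. $\underline{R}_{\mathcal{X}}$ is a coherent lower prevision on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$.

Proof. We show that $\underline{R}_{\mathcal{X}}$ satisfies the necessary and sufficient conditions (P1)-(P3) for coherence of a lower prevision on a linear space.

We first prove that (P1) is satisfied. Consider any $p \in \mathcal{V}(\Sigma_{\mathcal{X}})$. Let $m$ be the degree of $p$. We must show that $\underline{R}_{\mathcal{X}}(p) \ge \min p$. We find that $\underline{R}_{\mathcal{X}}(p) = \underline{Q}^n_{\mathcal{X}}(b^n_p) \ge \min b^n_p$ for all $n \ge m$, because of the coherence [accepting sure gains] of the count lower previsions $\underline{Q}^n_{\mathcal{X}}$. But Proposition 8 in the Appendix tells us that $\min b^n_p \uparrow \min p$, whence indeed $\underline{R}_{\mathcal{X}}(p) \ge \min p$.

Next, consider any $p$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$ and any real $\lambda \ge 0$. Consider any $n$ that is not smaller than the degree of $p$. Since obviously $b^n_{\lambda p} = \lambda b^n_p$, we get

$$\underline{R}_{\mathcal{X}}(\lambda p) = \underline{Q}^n_{\mathcal{X}}(b^n_{\lambda p}) = \underline{Q}^n_{\mathcal{X}}(\lambda b^n_p) = \lambda\,\underline{Q}^n_{\mathcal{X}}(b^n_p) = \lambda\,\underline{R}_{\mathcal{X}}(p),$$

where the third equality follows from the coherence [non-negative homogeneity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$. This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ satisfies the non-negative homogeneity requirement (P2).

Finally, consider $p$ and $q$ in $\mathcal{V}(\Sigma_{\mathcal{X}})$, and any $n$ that is not smaller than the maximum of the degrees of $p$ and $q$. Since obviously $b^n_{p+q} = b^n_p + b^n_q$, we get

$$\underline{R}_{\mathcal{X}}(p+q) = \underline{Q}^n_{\mathcal{X}}(b^n_{p+q}) = \underline{Q}^n_{\mathcal{X}}(b^n_p + b^n_q) \ge \underline{Q}^n_{\mathcal{X}}(b^n_p) + \underline{Q}^n_{\mathcal{X}}(b^n_q) = \underline{R}_{\mathcal{X}}(p) + \underline{R}_{\mathcal{X}}(q),$$

where the inequality follows from the coherence [super-additivity] of the count lower prevision $\underline{Q}^n_{\mathcal{X}}$. This tells us that the lower prevision $\underline{R}_{\mathcal{X}}$ also satisfies the super-additivity requirement (P3), and as a consequence it is coherent. □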

We can summarise the argument above as follows. 

Theorem 5 (Representation theorem for exchangeable sequences). Given a time consistent family of exchangeable coherent lower previsions $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, there is a unique coherent lower prevision $\underline{R}_{\mathcal{X}}$ on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex, such that for all $n \ge 1$, all $f \in \mathcal{L}(\mathcal{X}^n)$ and all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$:

$$\underline{P}^n_{\mathcal{X}}(f) = \underline{R}_{\mathcal{X}}(\mathrm{Mn}^n_{\mathcal{X}}(f|\cdot)) \quad\text{and}\quad \underline{Q}^n_{\mathcal{X}}(g) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(g|\cdot)). \tag{16}$$

Hence, the belief model governing any countable exchangeable sequence in $\mathcal{X}$ can be completely characterised by a coherent lower prevision on the linear space of polynomial gambles on $\Sigma_{\mathcal{X}}$.

In the particular case where we have a time consistent family of exchangeable linear previsions $P^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, then $R_{\mathcal{X}}$ will be a linear prevision on the linear space $\mathcal{V}(\Sigma_{\mathcal{X}})$ of all polynomial gambles on the $\mathcal{X}$-simplex. As such, it will be characterised by its values $R_{\mathcal{X}}(B_m)$ on the Bernstein basis polynomials $B_m$, $m \in \mathcal{N}^n_{\mathcal{X}}$, $n \ge 1$, or on any other basis of $\mathcal{V}(\Sigma_{\mathcal{X}})$.

It is a consequence of coherence that $\underline{R}_{\mathcal{X}}$ is also uniquely determined on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$ of all continuous gambles on the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$: by the Stone-Weierstraß theorem, any such gamble is the uniform limit of some sequence of polynomial gambles, and coherence implies that the lower prevision of a uniform limit is the limit of the lower previsions.

This unicity result cannot be extended to more general (discontinuous) types of gambles: the coherent lower prevision $\underline{R}_{\mathcal{X}}$ is not uniquely determined on the set $\mathcal{L}(\Sigma_{\mathcal{X}})$ of all gambles on the simplex, and there may be different coherent lower previsions $\underline{R}^1_{\mathcal{X}}$ and $\underline{R}^2_{\mathcal{X}}$ on $\mathcal{L}(\Sigma_{\mathcal{X}})$ satisfying Equation (16). But any such lower previsions will agree on the class $\mathcal{V}(\Sigma_{\mathcal{X}})$ of polynomial gambles, which is the class of gambles we need in order to characterise the exchangeable sequence.

Footnote (non-uniqueness beyond continuous gambles). See Miranda et al. (2007) for a study of the gambles whose prevision is determined by the prevision of the polynomials.

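We now investigate the meaning of the representing lower prevision $\underline{R}_{\mathcal{X}}$ a bit further. Consider the sequence of so-called frequency random variables $F_n := T^n_{\mathcal{X}}(X_1,\dots,X_n)/n$ corresponding to an exchangeable sequence of random variables $X_1, \dots, X_n, \dots$, and assuming values in the $\mathcal{X}$-simplex $\Sigma_{\mathcal{X}}$. The distribution $\underline{P}_{F_n}$ of $F_n$, i.e., the coherent lower prevision on $\mathcal{L}(\Sigma_{\mathcal{X}})$ that models the available information about the values that $F_n$ assumes in $\Sigma_{\mathcal{X}}$, is given by

$$\underline{P}_{F_n}(h) := \underline{Q}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n}\big) = \underline{R}_{\mathcal{X}}\big(\mathrm{CoMn}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n}\,\big|\,\cdot\big)\big), \quad h \in \mathcal{L}(\Sigma_{\mathcal{X}}),$$

where $h \circ \frac{1}{n}$ denotes the gamble $m \mapsto h(m/n)$ on $\mathcal{N}^n_{\mathcal{X}}$, because we know that $\underline{Q}^n_{\mathcal{X}}$ is the distribution of $T^n_{\mathcal{X}}(X_1,\dots,X_n)$, and also taking into account Theorem 5 for the last equality. Now,

$$\mathrm{CoMn}^n_{\mathcal{X}}\big(h \circ \tfrac{1}{n}\,\big|\,\theta\big) = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} h\Big(\frac{m}{n}\Big) B_m(\theta)$$

is the Bernstein approximant or approximating Bernstein polynomial of degree $n$ for the gamble $h$, and it is a known result (see Feller, 1971, Section VII.2; Heitzinger et al., 2003, Section 2) that the sequence of approximating Bernstein polynomials $\mathrm{CoMn}^n_{\mathcal{X}}(h \circ \frac{1}{n}|\cdot)$ converges uniformly to $h$ for $n \to \infty$ if $h$ is continuous. So, because $\underline{R}_{\mathcal{X}}$ is defined uniquely, and is uniformly continuous, on the set $\mathcal{C}(\Sigma_{\mathcal{X}})$, we find the following result, which provides an interpretation for the representation $\underline{R}_{\mathcal{X}}$, and which can be seen as another generalisation of de Finetti's Representation Theorem: $\underline{R}_{\mathcal{X}}$ is the limit of the frequency distributions.

Theorem 6. For all continuous gambles $h$ on $\Sigma_{\mathcal{X}}$, we have that

$$\lim_{n\to\infty} \underline{P}_{F_n}(h) = \underline{R}_{\mathcal{X}}(h),$$

or, in other words, the sequence of distributions $\underline{P}_{F_n}$ converges point-wise to $\underline{R}_{\mathcal{X}}$ on $\mathcal{C}(\Sigma_{\mathcal{X}})$, and in this specific sense, the sample frequencies $F_n$ converge in distribution.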
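The uniform convergence of Bernstein approximants, on which Theorem 6 rests, is easy to observe numerically in the case of two categories. The sketch below assumes a hypothetical continuous gamble `h` on $[0,1]$ and measures the gap on a finite grid, so it illustrates the convergence rather than proving it.

```python
from math import comb

def bernstein_approximant(h, n, theta):
    """Degree-n Bernstein approximant of h: sum over s of h(s/n) * B^n_s(theta)."""
    return sum(h(s / n) * comb(n, s) * theta**s * (1 - theta)**(n - s)
               for s in range(n + 1))

def sup_error(h, n, grid_size=201):
    """Largest gap between h and its degree-n approximant on an even grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(h(t) - bernstein_approximant(h, n, t)) for t in grid)

h = lambda t: abs(t - 0.5)           # continuous on [0, 1] but not a polynomial
errors = {n: sup_error(h, n) for n in (5, 20, 80)}
```

The entries of `errors` shrink as `n` grows, although slowly: for this gamble the gap is largest at the kink `t = 0.5`.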

Running example. Back to our example, where $\mathcal{X} = \mathbb{B}$. Here the Representation Theorem (Theorem 5) states that the coherent count lower previsions $\underline{Q}^n_{\mathbb{B}}$, $n \ge 1$, for any exchangeable sequence of variables in $\mathbb{B}$ have the form

$$\underline{Q}^n_{\mathbb{B}}(g) = \underline{R}_{\mathbb{B}}(\mathrm{CoBi}^n(g|\cdot)),$$

for all gambles $g$ on the set $\{0,1,\dots,n\}$ of possible numbers of successes $s$, where the (count) binomial distribution $\mathrm{CoBi}^n(\cdot|\theta)$ is given by Equation (13), and $\underline{R}_{\mathbb{B}}$ is some coherent lower prevision defined on the set $\mathcal{V}([0,1])$ of all polynomials on $[0,1]$, which is the set of possible values for the probability $\theta$ of a success.

This $\underline{R}_{\mathbb{B}}$ can be uniquely extended to a coherent lower prevision on the set $\mathcal{C}([0,1])$ of all continuous gambles (functions) on $[0,1]$. And Theorem 6 assures us that this $\underline{R}_{\mathbb{B}}$ on $\mathcal{C}([0,1])$ is the 'limiting distribution' of the frequency of successes $F_n = T^n_{\mathbb{B}}(X_1,\dots,X_n)/n$, as the number of 'trials' $n$ goes to infinity.

When all the count distributions $\underline{Q}^n_{\mathbb{B}}$ are linear previsions $Q^n_{\mathbb{B}}$, then the representation $\underline{R}_{\mathbb{B}}$ is a linear prevision $R_{\mathbb{B}}$, and vice versa. This linear prevision on $\mathcal{V}([0,1])$, or equivalently, on $\mathcal{C}([0,1])$, is completely determined by (and of course completely determines) its values on any basis of the set of polynomials on $[0,1]$. If we take as a basis the set $\{\theta^n : n \ge 0\}$, then we see that $R_{\mathbb{B}}$ is completely determined by its (raw) moment sequence $m_n = R_{\mathbb{B}}(\theta^n)$, $n \ge 0$. It is well known (see for instance Feller, 1971, Section VII.3) that in the case of finitely additive probabilities, or linear previsions, a moment sequence uniquely determines a distribution function, except in its discontinuity points. And this brings us right back to de Finetti's (1937) version of the Representation Theorem: "la loi de probabilité $\Phi_n(\xi) = P(Y_n \le \xi)$ tend vers une limite pour $n \to \infty$. [...] il s'ensuit qu'il existe une loi-limite $\Phi(\xi)$ telle que $\lim_{n\to\infty}\Phi_n(\xi) = \Phi(\xi)$ sauf peut-être pour les points de discontinuité." (In English: "the probability law $\Phi_n(\xi) = P(Y_n \le \xi)$ tends to a limit for $n \to \infty$. [...] it follows that there exists a limit law $\Phi(\xi)$ such that $\lim_{n\to\infty}\Phi_n(\xi) = \Phi(\xi)$, except perhaps at the points of discontinuity.")

Footnote (on extending the representation beyond the polynomial gambles). We refrain here from imposing conditions other than coherence (e.g., related to $\sigma$-additivity) on such extensions, which could guarantee unicity on the set of all measurable gambles; see Miranda et al. (2007) for related discussion.

6. Looking at the sample means 

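Consider an exchangeable sequence $X_1, \dots, X_n, \dots$, and any gamble $f$ on $\mathcal{X}$. Then the sequence $f(X_1), \dots, f(X_n), \dots$ is again an exchangeable sequence of random variables, now taking values in the finite set $f(\mathcal{X})$. We are interested in the sample means

$$S_n(f)(X_1,\dots,X_n) := \frac{1}{n}\sum_{k=1}^{n} f(X_k),$$

which form a sequence of random variables in $[\inf f, \sup f]$. For any $m$ in $\mathcal{N}^n_{\mathcal{X}}$ and any $z \in [m]$,

$$S_n(f)(z) = \frac{1}{n}\sum_{k=1}^{n} f(z_k) = \frac{1}{n}\sum_{x\in\mathcal{X}} m_x f(x) = S_{\mathcal{X}}\Big(f\,\Big|\,\frac{m}{n}\Big),$$

where for each $\theta \in \Sigma_{\mathcal{X}}$, we have defined the linear prevision $S_{\mathcal{X}}(\cdot|\theta)$ on $\mathcal{L}(\mathcal{X})$ by $S_{\mathcal{X}}(f|\theta) := \sum_{x\in\mathcal{X}} f(x)\theta_x$. Observe that $S_{\mathcal{X}}(f|\cdot)$ is a very special (linear) polynomial gamble on the $\mathcal{X}$-simplex. We then get, for any gamble $h$ on $[\inf f, \sup f]$,

$$\mathrm{MuHy}^n_{\mathcal{X}}(h(S_n(f))|m) = \frac{1}{\nu(m)}\sum_{z\in[m]} h(S_n(f)(z)) = \frac{1}{\nu(m)}\sum_{z\in[m]} h\Big(S_{\mathcal{X}}\Big(f\,\Big|\,\frac{m}{n}\Big)\Big) = h\Big(S_{\mathcal{X}}\Big(f\,\Big|\,\frac{m}{n}\Big)\Big),$$

so we find for the distribution $\underline{P}_{S_n(f)}$ of the sample mean $S_n(f)$, which is a coherent lower prevision on $\mathcal{L}([\inf f, \sup f])$, that

$$\underline{P}_{S_n(f)}(h) = \underline{Q}^n_{\mathcal{X}}\big(h(S_{\mathcal{X}}(f|\cdot)) \circ \tfrac{1}{n}\big), \quad h \in \mathcal{L}([\inf f, \sup f]).$$

In terms of the representing lower prevision $\underline{R}_{\mathcal{X}}$, Theorem 5 gives $\underline{P}_{S_n(f)}(h) = \underline{R}_{\mathcal{X}}(\mathrm{CoMn}^n_{\mathcal{X}}(h(S_{\mathcal{X}}(f|\cdot)) \circ \frac{1}{n}|\cdot))$, and we see that

$$\mathrm{CoMn}^n_{\mathcal{X}}\big(h(S_{\mathcal{X}}(f|\cdot)) \circ \tfrac{1}{n}\,\big|\,\theta\big) = \sum_{m\in\mathcal{N}^n_{\mathcal{X}}} h\Big(S_{\mathcal{X}}\Big(f\,\Big|\,\frac{m}{n}\Big)\Big) B_m(\theta)$$

is the approximating Bernstein polynomial for the gamble $h(S_{\mathcal{X}}(f|\cdot))$ on $\Sigma_{\mathcal{X}}$. So for all continuous gambles $h$ on $[\inf f, \sup f]$, $h(S_{\mathcal{X}}(f|\cdot))$ is a continuous gamble on $\Sigma_{\mathcal{X}}$, and is therefore the uniform limit of its sequence of approximating Bernstein polynomials. Since a coherent lower prevision is uniformly continuous, we see that

$$\lim_{n\to\infty} \underline{P}_{S_n(f)}(h) = \underline{R}_{\mathcal{X}}(h(S_{\mathcal{X}}(f|\cdot))). \tag{17}$$

This tells us that for an exchangeable sequence $X_1, \dots, X_n, \dots$ the sequence of sample means $S_n(f)(X_1,\dots,X_n)$ converges in distribution.

Footnote (to the quotation from de Finetti (1937) at the end of Section 5). Our italics. In de Finetti's notation, $Y_n$ is our $F_n$, and $\Phi_n$ its distribution function.

7. Exchangeable natural extension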

Throughout this paper, we have always considered exchangeable lower previsions $\underline{P}^N_{\mathcal{X}}$ defined on the set $\mathcal{L}(\mathcal{X}^N)$ of all gambles on $\mathcal{X}^N$. At first sight, it seems an impossible task to specify or assess such an exchangeable lower prevision: a subject must specify an uncountable infinity of supremum acceptable prices, and at the same time keep track of all the symmetry requirements imposed by exchangeability, as well as the coherence requirement.

Alternatively, a subject must specify a coherent count lower prevision $\underline{Q}^N_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$, and this means specifying an uncountable infinity of real numbers $\underline{Q}^N_{\mathcal{X}}(g)$, for all gambles $g$ on $\mathcal{N}^N_{\mathcal{X}}$.

Is it therefore realistic, or of any practical relevance, to consider such exchangeable coherent lower previsions? Indeed it is, and we now want to show why.

7.1. The general problem. What will usually happen in practice is that a subject makes an assessment that $N$ variables $X_1, \dots, X_N$ taking values in a finite set $\mathcal{X}$ are exchangeable, and in addition specifies supremum acceptable buying prices $\underline{P}(f)$ for all gambles in some (typically finite, but not necessarily so) set of gambles $\mathcal{K} \subseteq \mathcal{L}(\mathcal{X}^N)$. The question then is: can we turn these assessments into an exchangeable coherent lower prevision $\underline{E}_{\underline{P},\mathcal{P}_N}$ defined on all of $\mathcal{L}(\mathcal{X}^N)$, that is furthermore as small (least-committal, conservative) as possible?

To answer this question, we begin by looking at the most conservative (i.e., point-wise smallest) exchangeable coherent lower prevision $\underline{E}_{\mathcal{P}_N}$ for $N$ variables. Since the most conservative coherent lower prevision on $\mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$ is the vacuous lower prevision, given by $\underline{Q}^N_{\mathcal{X}}(g) = \min_{m\in\mathcal{N}^N_{\mathcal{X}}} g(m)$, our Representation Theorem for finite exchangeable sequences (Theorem 2) tells us that

$$\underline{E}_{\mathcal{P}_N}(f) = \min_{m\in\mathcal{N}^N_{\mathcal{X}}} \mathrm{MuHy}^N_{\mathcal{X}}(f|m) \tag{18}$$

for all gambles $f$ on $\mathcal{X}^N$, whose corresponding count lower prevision is vacuous. It models a subject's beliefs about sampling without replacement from an urn with $N$ balls, where this subject is completely ignorant about the composition of the urn.
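Equation (18) is simple to evaluate directly for small problems. The following sketch assumes two categories and $N = 3$; the gambles `first_success` and `difference` are hypothetical examples chosen to show how little, and yet how much, a bare exchangeability assessment says.

```python
from itertools import product
from math import comb

N = 3
outcomes = list(product((0, 1), repeat=N))

def hy(f, s):
    """Average of f over the sequences with s successes."""
    members = [z for z in outcomes if sum(z) == s]
    return sum(f(z) for z in members) / comb(N, s)

def lower_vacuous_exchangeable(f):
    """Equation (18) for two categories: minimum of hy(f, s) over s = 0, ..., N."""
    return min(hy(f, s) for s in range(N + 1))

def upper_vacuous_exchangeable(f):
    """Conjugate upper prevision: minus the lower prevision of minus f."""
    return -lower_vacuous_exchangeable(lambda z: -f(z))

first_success = lambda z: float(z[0])          # indicator: first variable is 1
difference = lambda z: float(z[0] - z[1])      # first variable minus second

bounds_first = (lower_vacuous_exchangeable(first_success),
                upper_vacuous_exchangeable(first_success))
bounds_diff = (lower_vacuous_exchangeable(difference),
               upper_vacuous_exchangeable(difference))
```

For `first_success` the bounds are vacuous (0 and 1), reflecting complete ignorance about the urn's composition. For `difference` both bounds are zero: exchangeability alone fixes the price of swapping two variables.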

Using this we can invoke a general theorem we have proven elsewhere, about 

the existence of coherent lower previsions that ar e (strongly) invariant under a monoid of 
transformations ( De Cooman and Miranda , 2007 , Theorem 16) to find thaQ 



ENE-1. there are exchangeable coherent lower previsions on ^{^^) that dominate P on 
if and only if 



Y.^kifk-Pifk)])>0 forall«>0,Ai>Oand/^e (19) 

k=l 



Footnote: When $\underline{Q}^N_{\mathcal{X}}$ is a linear prevision $Q^N_{\mathcal{X}}$, it suffices to specify a finite number of real numbers $Q^N_{\mathcal{X}}(\{m\})$, for $m$ in $\mathcal{N}^N_{\mathcal{X}}$, but such an extremely efficient reduction is generally not possible for coherent count lower previsions $\underline{Q}^N_{\mathcal{X}}$.

Footnote (on the exchangeability assessment): This is a so-called structural assessment in Walley's (1991) terminology.

Footnote (on Equations (19) and (20)): Equation (19) is closely related to the avoiding sure loss condition (1), but where the supremum is replaced by the coherent upper prevision $\overline{E}^N_{\mathrm{ex}}$. Similarly, Equation (20) is related to the expression (3) for natural extension, but where the infimum operator is replaced by the coherent lower prevision $\underline{E}^N_{\mathrm{ex}}$. There is a small and easily correctable oversight in the formulation of Theorem 16 of De Cooman and Miranda (2007), as becomes immediately apparent when considering its proof: it is there (but should not be) formulated without the multipliers $\lambda_k \geq 0$.

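ENE-2. in that case the point-wise smallest (most conservative) exchangeable coherent lower prevision $\underline{E}_{\underline{P},\mathrm{ex}}$ on $\mathcal{L}(\mathcal{X}^N)$ that dominates $\underline{P}$ on $\mathcal{K}$ is given by

$$\underline{E}_{\underline{P},\mathrm{ex}}(f) = \sup\Big\{ \underline{E}^N_{\mathrm{ex}}\Big(f - \sum_{k=1}^{n} \lambda_k [f_k - \underline{P}(f_k)]\Big) : n \geq 0,\ \lambda_k \geq 0,\ f_k \in \mathcal{K} \Big\},$$ (20)

and is called the exchangeable natural extension of $\underline{P}$.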
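If we now combine Equation (18) with Equations (19) and (20), and define the lower prevision $\underline{Q}$ on the set

$$\mathcal{H} := \{ \mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot) : f \in \mathcal{K} \} \subseteq \mathcal{L}(\mathcal{N}^N_{\mathcal{X}})$$

by letting

$$\underline{Q}(g) := \sup\{ \underline{P}(f) : \mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot) = g,\ f \in \mathcal{K} \}$$

for all $g \in \mathcal{H}$, then it is but a small technical step to prove the following result.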

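Theorem 7 (Exchangeable natural extension). There are exchangeable coherent lower previsions on $\mathcal{L}(\mathcal{X}^N)$ that dominate $\underline{P}$ on $\mathcal{K}$ if and only if $\underline{Q}$ is a lower prevision on $\mathcal{H}$ that avoids sure loss. In that case $\underline{E}_{\underline{P},\mathrm{ex}} = \underline{E}_{\underline{Q}}(\mathrm{MuHy}^N_{\mathcal{X}}(\cdot|\cdot))$, i.e., the count distribution for the exchangeable natural extension $\underline{E}_{\underline{P},\mathrm{ex}}$ of $\underline{P}$ is the natural extension $\underline{E}_{\underline{Q}}$ of the lower prevision $\underline{Q}$.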

Since there are quite efficient algorithms (Walley et al., 2004) for calculating the natural extension of a lower prevision based on a finite number of assessments, this theorem not only has intuitive appeal, but also provides us with an elegant and efficient manner to find the exchangeable natural extension, i.e., to combine (finitary) local assessments $\underline{P}$ with the structural assessment of exchangeability.
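The first step of this procedure, turning local assessments on gambles $f$ into assessments on their count transforms $\mathrm{MuHy}^N_{\mathcal{X}}(f|\cdot)$, is easy to sketch for a finite set of assessments. The code below is only an illustration: the helper names and the two example assessments are hypothetical, and the natural extension step itself (a linear program) is not attempted here.

```python
from fractions import Fraction
from itertools import product


def muhy_transform(f, values, N):
    """Return MuHy(f|.) as a dict: count vector m -> average of f over
    all sequences of length N with count vector m."""
    sums, sizes = {}, {}
    for z in product(values, repeat=N):
        m = tuple(z.count(v) for v in values)
        sums[m] = sums.get(m, Fraction(0)) + Fraction(f(z))
        sizes[m] = sizes.get(m, 0) + 1
    return {m: sums[m] / sizes[m] for m in sums}


def count_assessments(assessments, values, N):
    """Map assessments [(f, P(f)), ...] to {g: Q(g)}, where Q(g) is the largest
    P(f) over all assessed f with MuHy(f|.) = g (a maximum, since the set is finite)."""
    Q = {}
    for f, price in assessments:
        g = tuple(sorted(muhy_transform(f, values, N).items()))  # hashable form of g
        Q[g] = max(Q.get(g, price), price)
    return Q


# Hypothetical assessments for N = 2 variables in {'a', 'b'}.
first_is_a = lambda z: int(z[0] == "a")
second_is_a = lambda z: int(z[1] == "a")
Q = count_assessments(
    [(first_is_a, Fraction(2, 5)), (second_is_a, Fraction(1, 2))], ("a", "b"), 2
)
for g, price in Q.items():
    print(dict(g), price)
```

Both example gambles have the same count transform, so exchangeability merges them into a single gamble on count vectors and keeps the larger of the two buying prices.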

7.2. From n to n + k exchangeable random variables? Suppose we have $n$ random variables $X_1, \dots, X_n$ that a subject judges to be exchangeable, and whose distribution is given by the exchangeable coherent lower prevision $\underline{P}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^n)$, with count distribution $\underline{Q}^n_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$. Can this model be extended to a coherent exchangeable model for $n + k$ variables? And if so, what is the most conservative such extended model?

It is well-known that when $\underline{P}^n_{\mathcal{X}}$ is a linear prevision $P^n_{\mathcal{X}}$, it cannot generally be extended (Diaconis and Freedman, 1980). In the more general case that we are considering here, we now look at our Theorem 2 to provide us with an elegant answer: the problem considered here is a special case of the one studied in Section 7.1.

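Indeed, if we denote, as before, by $\tilde{f}$ the cylindrical extension to $\mathcal{X}^{n+k}$ of the gamble $f$ on $\mathcal{X}^n$, then we see that the local assessments $\underline{P}$ are defined on the set of gambles $\mathcal{K} := \{\tilde{f} : f \in \mathcal{L}(\mathcal{X}^n)\} \subseteq \mathcal{L}(\mathcal{X}^{n+k})$ by $\underline{P}(\tilde{f}) := \underline{P}^n_{\mathcal{X}}(f)$, $f \in \mathcal{L}(\mathcal{X}^n)$. Observe that here $N = n + k$. If we recall the relevant equation in Section 4.2, then we see that the corresponding set $\mathcal{H} \subseteq \mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ is given by

$$\mathcal{H} = \{ \hat{g} : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \},$$

where for any gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ and all $\mu \in \mathcal{N}^{n+k}_{\mathcal{X}}$,

$$\hat{g}(\mu) := \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\mu - m)}{\nu(\mu)}\, g(m) = P(g|\mu),$$

where $P(\cdot|\mu)$ is the linear prevision associated with drawing $n$ balls without replacement from an urn with composition $\mu$. Moreover, for any $h$ in $\mathcal{H}$, there is a unique gamble $g$ on $\mathcal{N}^n_{\mathcal{X}}$ such that $h = \hat{g}$. This implies that the corresponding lower prevision $\underline{Q}$ on $\mathcal{H}$ is given by

$$\underline{Q}(\hat{g}) = \underline{Q}^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}).$$

Footnote (on the definition of $\underline{Q}$ in Section 7.1): Observe that it is necessary that $\underline{Q}(g)$ should be finite, in order for the condition (19) to hold.

Footnote (on Theorem 7): The explicit requirement that $\underline{Q}$ is a lower prevision means that $\underline{Q}$ must be nowhere infinite.
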

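Now observe that

(a) $\hat{\lambda} = \lambda$ for all real $\lambda$;

(b) $\widehat{\lambda g} = \lambda \hat{g}$ for all $g$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$ and all real $\lambda$;

(c) $\widehat{g_1 + g_2} = \hat{g}_1 + \hat{g}_2$ for all $g_1$ and $g_2$ in $\mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$.

This tells us that $\mathcal{H}$ is a linear subspace of $\mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ that contains all constant gambles.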

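Moreover, because $\underline{Q}^n_{\mathcal{X}}$ is a coherent lower prevision, we find that

(i) $\underline{Q}(h_1 + h_2) \geq \underline{Q}(h_1) + \underline{Q}(h_2)$ for all $h_1$ and $h_2$ in $\mathcal{H}$;

(ii) $\underline{Q}(\lambda h) = \lambda \underline{Q}(h)$ for all real $\lambda \geq 0$ and all $h$ in $\mathcal{H}$;

(iii) $\underline{Q}(h + \lambda) = \underline{Q}(h) + \lambda$ for all real $\lambda$ and all $h$ in $\mathcal{H}$.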

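Because $\underline{Q}$ and $\mathcal{H}$ have these special properties, the condition for $\underline{P}^n_{\mathcal{X}}$ to be extendable to some coherent exchangeable model for $n + k$ variables, namely that $\underline{Q}$ avoids sure loss on $\mathcal{H}$, simplifies to $\max \hat{g} \geq \underline{Q}(\hat{g})$ for all $g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}})$, i.e., to

$$\max_{\mu \in \mathcal{N}^{n+k}_{\mathcal{X}}} \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\mu - m)}{\nu(\mu)}\, g(m) \geq \underline{Q}^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}).$$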
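To make the extendibility condition concrete, the sketch below computes the hypergeometric transform of a gamble on count vectors and tests the condition for one particular gamble. It assumes that $\nu(m)$ is the multinomial coefficient counting the sequences with count vector $m$, and that terms in which $\mu - m$ has a negative component are dropped; the function names and the example (two variables known to take different values) are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import factorial


def count_vectors(n, d):
    """All tuples of d non-negative integers that sum to n."""
    if d == 1:
        return [(n,)]
    return [(i,) + rest for i in range(n + 1) for rest in count_vectors(n - i, d - 1)]


def nu(m):
    """Number of sequences with count vector m (a multinomial coefficient)."""
    out = factorial(sum(m))
    for c in m:
        out //= factorial(c)
    return out


def hat(g, n, k, d):
    """Map a gamble g on count vectors of size n to the gamble on count vectors
    mu of size n + k given by sum_m nu(m) nu(mu - m) / nu(mu) * g(m)."""
    out = {}
    for mu in count_vectors(n + k, d):
        total = Fraction(0)
        for m in count_vectors(n, d):
            rest = tuple(a - b for a, b in zip(mu, m))
            if min(rest) >= 0:  # assumption: other terms contribute nothing
                total += Fraction(nu(m) * nu(rest), nu(mu)) * g[m]
        out[mu] = total
    return out


# Hypothetical model: n = 2 binary variables that surely differ, so the count
# distribution puts all mass on m = (1, 1) and gives the indicator g prevision 1.
g = {(0, 2): 0, (1, 1): 1, (2, 0): 0}
lifted = hat(g, 2, 1, 2)
print(max(lifted.values()))  # 2/3
```

Since the maximum $2/3$ is strictly smaller than the assessed value $1$, the condition fails for this gamble, so this two-variable model cannot be extended to three exchangeable variables.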

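The expression for the natural extension $\underline{E}_{\underline{Q}}$ of $\underline{Q}$, applicable when the above condition holds, can also be simplified significantly, again because of the special properties of $\underline{Q}$ and $\mathcal{H}$:

$$\begin{aligned}
\underline{E}_{\underline{Q}}(h)
&= \sup\Big\{ \inf\Big[ h - \sum_{k=1}^{n} \lambda_k [\hat{g}_k - \underline{Q}(\hat{g}_k)] \Big] : n \geq 0,\ \lambda_k \geq 0,\ g_k \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \Big\} \\
&= \sup\big\{ \inf[h - \hat{g}] + \underline{Q}(\hat{g}) : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup\big\{ \underline{Q}(\hat{g} + \inf[h - \hat{g}]) : g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup\big\{ \underline{Q}(\hat{g}) : \hat{g} \leq h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\} \\
&= \sup\big\{ \underline{Q}^n_{\mathcal{X}}(g) : \hat{g} \leq h,\ g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}) \big\},
\end{aligned}$$

for all gambles $h$ on $\mathcal{N}^{n+k}_{\mathcal{X}}$. The point-wise smallest extension of $\underline{P}^n_{\mathcal{X}}$ to a coherent exchangeable model on $\mathcal{L}(\mathcal{X}^{n+k})$ is then the coherent exchangeable lower prevision with count distribution $\underline{E}_{\underline{Q}}$, because of Theorem 7.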

In the well-known case that $\underline{P}^n_{\mathcal{X}}$ is a linear prevision $P^n_{\mathcal{X}}$, and therefore $\underline{Q}^n_{\mathcal{X}}$ is also a linear prevision $Q^n_{\mathcal{X}}$, the condition for extendibility can also be written as

$$\min_{\mu \in \mathcal{N}^{n+k}_{\mathcal{X}}} P(g|\mu) \leq Q^n_{\mathcal{X}}(g) \quad \text{for all } g \in \mathcal{L}(\mathcal{N}^n_{\mathcal{X}}),$$

where on the left hand side we now see the lower prevision of the gamble $g$, associated with drawing $n$ balls from an urn with $n + k$ balls, of unknown composition. When this is satisfied, the lower prevision $\underline{Q}$ will actually be a linear prevision $Q$ on the linear space $\mathcal{H}$, and $\underline{E}_{\underline{Q}}$ will be the lower envelope of all linear previsions $Q^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{N}^{n+k}_{\mathcal{X}})$ that extend $Q$. Similarly, the exchangeable natural extension will be the lower envelope of all the exchangeable linear previsions $P^{n+k}_{\mathcal{X}}$ on $\mathcal{L}(\mathcal{X}^{n+k})$ that extend $P^n_{\mathcal{X}}$.

Footnote (on the uniqueness of the gamble $g$ with $h = \hat{g}$): To see this, consider the polynomial $p = \sum_{\mu \in \mathcal{N}^{n+k}_{\mathcal{X}}} h(\mu) B_\mu$. Use Zhou's formula [Equation (22) in the Appendix] to find that if $h = \hat{g}$, then also $p = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} g(m) B_m$, and consider that expansions in a Bernstein basis are unique.

8. Conclusions




We have shown that the notion of exchangeability has a natural place in the theory of coherent lower previsions. Indeed, in our approach, which uses Bernstein polynomials, and gambles rather than events, it seems fairly natural and easy to derive representation theorems directly for coherent lower previsions, and to derive the corresponding results for precise probabilities (linear previsions) as special cases.

Interesting results can also be obtained in a context of predictive inference, where a coherent exchangeable lower prevision for $n + k$ variables is updated with the information that the first $n$ variables have been observed to assume certain values. For a fairly detailed discussion of these issues, we refer to De Cooman and Miranda (2007, Section 9.3).

In Section 6 we have argued that the sample means $S_n(f)$ converge in distribution. It is possible (and quite easy for that matter) to prove stronger results. Indeed, using an approach that is completely similar to the one originally used by de Finetti (1937), we can prove that for all non-negative $n$ and $p$:

$$\overline{P}\big([S_{n+p}(f) - S_n(f)]^2\big) \leq \frac{p}{n(n+p)} \sup f^2.$$

In other words, for any fixed $p \geq 1$, the sequence $S_{n+p}(f) - S_n(f)$ 'converges in mean-square' to zero as $n \to \infty$. Even stronger, we find that for any non-negative $k$ and $\ell$

$$\overline{P}\big([S_k(f) - S_\ell(f)]^2\big) \leq \frac{|k - \ell|}{k\ell} \sup f^2,$$

and therefore the sequence $S_n(f)$ 'Cauchy-converges in mean-square'. These convergence results can also be used to derive the convergence in distribution of the $S_n(f)$, but we consider the approach using Bernstein polynomials to be distinctly more elegant.
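The flavour of these mean-square results can be checked numerically in the precise case. The sketch below is an illustration under stated assumptions, not the argument used in the text: it takes 0/1-valued variables that are IID given a parameter $\theta$, with a hypothetical two-point mixing distribution on $\theta$, and computes $E([S_{n+p} - S_n]^2)$ exactly by enumeration. A direct calculation for such a model gives $(E[\theta] - E[\theta^2])\,p/(n(n+p))$, which is at most $p/(n(n+p))$ here because the variables are bounded by 1.

```python
from fractions import Fraction
from itertools import product


def mean_square_gap(n, p, mixture):
    """Exact E[(S_{n+p} - S_n)^2] for 0/1 variables that are IID given theta,
    where theta is drawn from the finite mixture {theta: weight}."""
    total = Fraction(0)
    for z in product((0, 1), repeat=n + p):
        ones = sum(z)
        # probability of the sequence z under the mixture of IID models
        prob = sum(w * th**ones * (1 - th) ** (n + p - ones) for th, w in mixture.items())
        gap = Fraction(ones, n + p) - Fraction(sum(z[:n]), n)
        total += prob * gap**2
    return total


# Hypothetical mixture: theta is 1/4 or 3/4 with equal weight.
mixture = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
value = mean_square_gap(3, 2, mixture)
print(value, value <= Fraction(2, 3 * (3 + 2)))  # 1/40 True
```

Here $E[\theta] - E[\theta^2] = 1/2 - 5/16 = 3/16$ and $p/(n(n+p)) = 2/15$, whose product is the printed value $1/40$.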
Appendix A. Multivariate Bernstein polynomials

With any $n \geq 0$ and $m \in \mathcal{N}^n_{\mathcal{X}}$ there corresponds a Bernstein (basis) polynomial of degree $n$ on $\Sigma_{\mathcal{X}}$, given by $B_m(\theta) = \nu(m) \prod_{x \in \mathcal{X}} \theta_x^{m_x}$, $\theta \in \Sigma_{\mathcal{X}}$. These polynomials have a number of very interesting properties (see for instance Prautzsch et al., 2002, Chapters 10 and 11), which we list here:

B1. The set $\{B_m : m \in \mathcal{N}^n_{\mathcal{X}}\}$ of all Bernstein polynomials of fixed degree $n$ is linearly independent: if $\sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \lambda_m B_m = 0$, then $\lambda_m = 0$ for all $m$ in $\mathcal{N}^n_{\mathcal{X}}$.

B2. The set $\{B_m : m \in \mathcal{N}^n_{\mathcal{X}}\}$ of all Bernstein polynomials of fixed degree $n$ forms a partition of unity: $\sum_{m \in \mathcal{N}^n_{\mathcal{X}}} B_m = 1$.

B3. All Bernstein basis polynomials are non-negative, and strictly positive in the interior of $\Sigma_{\mathcal{X}}$.

B4. The set $\{B_m : m \in \mathcal{N}^n_{\mathcal{X}}\}$ of all Bernstein polynomials of fixed degree $n$ forms a basis for the linear space of all polynomials whose degree is at most $n$.

Property B4 follows from B1 and B2. It follows from B4 that:

B5. Any polynomial $p$ of degree $m$ has a unique expansion in terms of the Bernstein basis polynomials of fixed degree $n \geq m$,

or in other words, there is a unique gamble $b^n_p$ on $\mathcal{N}^n_{\mathcal{X}}$ such that

$$p = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} b^n_p(m) B_m = \mathrm{CoMn}^n_{\mathcal{X}}(b^n_p|\cdot).$$

This tells us [also use B2 and B3] that each $p(\theta)$ is a convex combination of the Bernstein coefficients $b^n_p(m)$, $m \in \mathcal{N}^n_{\mathcal{X}}$, whence

$$\min b^n_p \leq \min p \leq p(\theta) \leq \max p \leq \max b^n_p.$$ (21)
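The bounds in (21) are easy to check in the simplest setting. The sketch below assumes a two-element set $\mathcal{X}$, so that $\theta = (t, 1 - t)$ and a polynomial on the simplex is a polynomial in $t$; it uses the standard conversion from the power basis to the univariate Bernstein basis. The function names and the example polynomial $p(t) = t(1 - t)$ are illustrative choices.

```python
from fractions import Fraction
from math import comb


def bernstein_coefficients(a, n):
    """Coefficients b_0..b_n of p(t) = sum_i a[i] * t**i in the degree-n
    Bernstein basis comb(n, j) * t**j * (1 - t)**(n - j)."""
    deg = len(a) - 1
    assert n >= deg, "the Bernstein degree must be at least the degree of p"
    return [
        sum(Fraction(comb(j, i), comb(n, i)) * a[i] for i in range(min(j, deg) + 1))
        for j in range(n + 1)
    ]


def evaluate(b, t):
    """Evaluate the polynomial with Bernstein coefficients b at the point t."""
    n = len(b) - 1
    return sum(b[j] * comb(n, j) * t**j * (1 - t) ** (n - j) for j in range(n + 1))


# p(t) = t - t^2, with range [0, 1/4] on the unit interval.
b4 = bernstein_coefficients([0, 1, -1], 4)
print(b4)  # coefficients 0, 1/4, 1/3, 1/4, 0
grid = [Fraction(i, 20) for i in range(21)]
print(all(min(b4) <= evaluate(b4, t) <= max(b4) for t in grid))  # True
```

The largest coefficient, $1/3$, overshoots the true maximum $1/4$ of $p$, which is exactly the slack that (21) allows.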
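It follows from a combination of B2 and B4 that for all $k \geq 0$ and all $\mu$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,

$$b^{n+k}_p(\mu) = \sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\mu - m)}{\nu(\mu)}\, b^n_p(m).$$ (22)

This is Zhou's formula (see Prautzsch et al., 2002, Section 11.9). Hence [let $p = 1$ and use B2] we find that for all $k \geq 0$ and all $\mu$ in $\mathcal{N}^{n+k}_{\mathcal{X}}$,

$$\sum_{m \in \mathcal{N}^n_{\mathcal{X}}} \frac{\nu(m)\,\nu(\mu - m)}{\nu(\mu)} = 1.$$ (23)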
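Zhou's formula can be exercised directly on a small multivariate example. The sketch below assumes that $\nu(m)$ is the multinomial coefficient and that terms with a negative component in $\mu - m$ are omitted; the helper names and the arbitrary coefficients are hypothetical. It checks that the elevated coefficients represent the same polynomial and that the weights in (23) sum to one.

```python
from fractions import Fraction
from math import factorial


def count_vectors(n, d):
    """All tuples of d non-negative integers that sum to n."""
    if d == 1:
        return [(n,)]
    return [(i,) + rest for i in range(n + 1) for rest in count_vectors(n - i, d - 1)]


def nu(m):
    """Multinomial coefficient: number of sequences with count vector m."""
    out = factorial(sum(m))
    for c in m:
        out //= factorial(c)
    return out


def elevate(b, n, k, d):
    """Degree elevation from n to n + k following Equation (22)."""
    out = {}
    for mu in count_vectors(n + k, d):
        total = Fraction(0)
        for m in count_vectors(n, d):
            rest = tuple(x - y for x, y in zip(mu, m))
            if min(rest) >= 0:  # assumption: other terms are omitted
                total += Fraction(nu(m) * nu(rest), nu(mu)) * b[m]
        out[mu] = total
    return out


def bernstein_value(b, theta):
    """Evaluate sum_m b(m) * nu(m) * prod_x theta_x ** m_x."""
    total = Fraction(0)
    for m, coeff in b.items():
        term = Fraction(nu(m)) * coeff
        for t, e in zip(theta, m):
            term *= t**e
        total += term
    return total


# Arbitrary degree-2 coefficients on a three-element set, elevated to degree 3.
b2 = {m: Fraction(i) for i, m in enumerate(count_vectors(2, 3))}
b3 = elevate(b2, 2, 1, 3)
theta = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
print(bernstein_value(b2, theta) == bernstein_value(b3, theta))  # True
print(min(b2.values()) <= min(b3.values()), max(b3.values()) <= max(b2.values()))  # True True
```

The second line of output reflects the fact that each elevated coefficient is a convex combination of the original ones, so the coefficient range can only shrink under elevation.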



The expressions (22) and (23) also imply that each $b^{n+k}_p(\mu)$ is a convex combination of the $b^n_p(m)$, and therefore $\min b^{n+k}_p \geq \min b^n_p$ and $\max b^{n+k}_p \leq \max b^n_p$. Combined with the inequalities in (21), this leads to:

$$[\min p, \max p] \subseteq [\min b^{n+k}_p, \max b^{n+k}_p] \subseteq [\min b^n_p, \max b^n_p]$$ (24)

for all $n \geq m$ and $k \geq 0$. This means that the non-decreasing sequence $\min b^n_p$ converges to some real number not greater than $\min p$, and, similarly, the non-increasing sequence $\max b^n_p$ converges to some real number not smaller than $\max p$. The following proposition strengthens this.

Proposition 8. For any polynomial $p$ on $\Sigma_{\mathcal{X}}$ of degree $m$,

$$\lim_{n \to \infty,\ n \geq m} [\min b^n_p, \max b^n_p] = [\min p, \max p] = p(\Sigma_{\mathcal{X}}).$$

Proof. This follows from the fact that the Bernstein coefficients $b^n_p$ converge uniformly to the polynomial $p$ as $n \to \infty$; see for instance Trump and Prautzsch (1996). Alternatively, it can be shown (see Prautzsch et al., 2002, Section 11.9) that for $n \geq m$

$$b^n_p(m) = p\Big(\frac{m}{n}\Big) + O\Big(\frac{1}{n}\Big) \quad \text{for all count vectors } m \in \mathcal{N}^n_{\mathcal{X}}.$$

From this, we deduce that $\min b^n_p \geq \min p + O(1/n)$ for any $n \geq m$, and as a consequence $\lim_{n \to \infty,\ n \geq m} \min b^n_p \geq \min p$. If we now use Equation (24), we see that $\lim_{n \to \infty,\ n \geq m} \min b^n_p = \min p$. The proof of the other equality is completely analogous. □
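The convergence of the coefficient range to the range of the polynomial can be observed numerically. The sketch below again assumes a two-element set $\mathcal{X}$, so that polynomials are univariate in $t$, and uses the hypothetical example $p(t) = t(1 - t)$, whose range on the unit interval is $[0, 1/4]$; the function name is an illustrative choice.

```python
from fractions import Fraction
from math import comb


def bernstein_range(a, n):
    """Return (min, max) of the degree-n Bernstein coefficients of
    p(t) = sum_i a[i] * t**i, for n at least the degree of p."""
    deg = len(a) - 1
    coeffs = [
        sum(Fraction(comb(j, i), comb(n, i)) * a[i] for i in range(min(j, deg) + 1))
        for j in range(n + 1)
    ]
    return min(coeffs), max(coeffs)


# p(t) = t - t^2: the upper end of the coefficient range shrinks towards 1/4.
degrees = [2, 4, 8, 16, 32, 64, 128, 256]
ranges = [bernstein_range([0, 1, -1], n) for n in degrees]
for n, (low, high) in zip(degrees, ranges):
    print(n, low, high, float(high - Fraction(1, 4)))
```

For this polynomial the coefficients are $j(n - j)/(n(n - 1))$, so the gap between the largest coefficient and $\max p$ decays like $1/n$, in line with the rate appearing in the proof.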